Re: Simple regular expression problem

by polypompholyx (Chaplain)
on Oct 03, 2005 at 13:59 UTC (#496922)

in reply to Simple regular expression problem

Apart from the typos, the reason is that \w+ gobbles up all the word characters in $str, including the number at the end, because \w also matches digits. Since your pattern allows 'zero or more' digits at the end, $num gets an empty string. You need to make the \w+ non-greedy, using the ? modifier:

my $str = "abdbdr23";
my ( $name, $num ) = ( $str =~ /^(\w+?)(\d*)$/ );

* is a much less ghastly way of writing {0,}.
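
To make the difference concrete, here is a minimal sketch that runs the greedy and the non-greedy pattern against the string from the question and against a string with no digits. The helper name split_name_num and the second test string are assumptions added for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): apply a pattern and return
# its two captures, or an empty list if the match fails.
sub split_name_num {
    my ($re, $str) = @_;
    return $str =~ $re;
}

my $greedy = qr/^(\w+)(\d*)$/;     # \w+ swallows the trailing digits too
my $lazy   = qr/^(\w+?)(\d*)$/;    # \w+? leaves trailing digits for \d*

for my $str ("abdbdr23", "abdbdr") {
    my ($gn, $gd) = split_name_num($greedy, $str);
    my ($ln, $ld) = split_name_num($lazy,   $str);
    print "$str: greedy=[$gn][$gd] lazy=[$ln][$ld]\n";
}
```

Because the pattern is anchored with $, the non-greedy \w+? is still forced to extend to the end of the string when there are no digits, so both patterns agree on "abdbdr".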

Re^2: Simple regular expression problem
by Perl Mouse (Chaplain) on Oct 03, 2005 at 14:14 UTC
    It's better to avoid the ? modifier in most cases, as it's less efficient than the alternatives. Here's a benchmark:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark 'cmpthese';
    use Test::More tests => 2;

    our @data = qw 'foo123 abdbdr23 abcd2 abc 1234 foo!123';
    our (@plain, @sticky);
    my @expected = ([qw 'foo 123'], [qw 'abdbdr 23'], [qw 'abcd 2'],
                    ['abc', ''], ['', 1234], []);

    cmpthese -1, {
        plain  => '@plain  = map {[/^([a-z]*)(\d*)$/]} @data',
        sticky => '@sticky = map {[/^(\w*?)(\d*)$/]} @data',
    };

    is_deeply \@plain,  \@expected;
    is_deeply \@sticky, \@expected;

    __END__
    1..2
              Rate sticky  plain
    sticky 32582/s     --   -17%
    plain  39385/s    21%     --
    ok 1
    ok 2
    Perl --((8:>*

      Benchmarking is fun. However, you should consider your results a little more carefully before making recommendations based on them. This would definitely count as a minor optimization at best, since we are talking about 32k instead of 40k per second, which means that unless you are doing hundreds of thousands of these comparisons you are never going to notice the difference. Also interesting is the result of that benchmark on my machine:

      1..2
                Rate  plain sticky
      plain  23682/s     --    -1%
      sticky 23904/s     1%     --
      ok 1
      ok 2

      Oddly, the difference dropped to a mere couple of hundred per second.

      Eric Hodges
        Well, that would be 192K vs. 240K, as the test does 6 regexes per iteration. However, if we tested against the string:
        ('a3' x 100) . '!3'
        the difference would be:
                   Rate sticky  plain
        sticky  13273/s     --   -95%
        plain  270490/s  1938%     --
        Don't dismiss benchmarks too early as "an insignificant difference".
        Perl --((8:>*
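
One plausible explanation for the large gap on that string is backtracking: before the non-greedy pattern can fail, \w*? has to be extended one character at a time up to the '!'. The sketch below tries to make that visible by counting how often the regex engine reaches the point between the two groups, using an embedded (?{ ... }) code block. The helper count_steps and the counter $steps are hypothetical names introduced here, and the exact counts may depend on the Perl version and its optimizations, so none are quoted.

```perl
use strict;
use warnings;

our $steps;    # package variable so the embedded code blocks can update it

# Hypothetical helper: match $str against $re and report whether it
# matched and how many times the embedded counter was reached.
sub count_steps {
    my ($re, $str) = @_;
    $steps = 0;
    my $matched = $str =~ $re ? 1 : 0;
    return ($matched, $steps);
}

my $str = ('a3' x 100) . '!3';    # the string from the post above

# Same patterns as in the benchmark, with a counter between the groups.
my $plain  = qr/^([a-z]*)(?{ $steps++ })(\d*)$/;
my $sticky = qr/^(\w*?)(?{ $steps++ })(\d*)$/;

my ($plain_ok,  $plain_steps)  = count_steps($plain,  $str);
my ($sticky_ok, $sticky_steps) = count_steps($sticky, $str);

print "plain:  matched=$plain_ok steps=$plain_steps\n";
print "sticky: matched=$sticky_ok steps=$sticky_steps\n";
```

Neither pattern matches this string, but the non-greedy one should reach the counter far more often before giving up, which is consistent with the timings above.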
      The OP didn't make it clear whether the string before the number could contain digits. However, it's certainly better to be specific in a regex: if you know (for some value of 'know') something will only contain [A-Za-z], not \w, then the former is probably preferable. On the other hand, [A-Za-z] too often means "I cannot think of any other letters", and then your script barfs on something perfectly valid, but unexpected, like "Ångström".
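
As a small illustration of that last point, the sketch below compares [A-Za-z] with the Unicode property \p{L} ("any letter") on the word "Ångström". The use of \p{L} and the helper names are assumptions added here, not something proposed in the thread; the word is written with escapes so that the example does not depend on the encoding of the source file.

```perl
use strict;
use warnings;

# Hypothetical helpers (not from the thread).
sub is_ascii_letters { return $_[0] =~ /^[A-Za-z]+$/ ? 1 : 0 }
sub is_letters       { return $_[0] =~ /^\p{L}+$/    ? 1 : 0 }

my $word = "\x{C5}ngstr\x{F6}m";    # "Ångström", written with escapes

print "ascii letters only: ", is_ascii_letters($word), "\n";
print "any Unicode letter: ", is_letters($word), "\n";
```

The ASCII-only class rejects the word, while \p{L} accepts it. Whether that is what you want depends on what the input is actually allowed to contain.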
      Thanks for all the suggestions. I fixed it with \w+?, or maybe I'll use the alpha example!! And now that I understand my mistake, I see that it was described in the perldoc manual all along!!

      All your replies are really helpful,
      Thanks a lot!!

Re^2: Simple regular expression problem
by ikegami (Pope) on Oct 03, 2005 at 14:40 UTC

    That won't work if the string contains no digits (which the OP said was a possible input). For example, $name will be just "a" for "abdbdr".

    Update: Just plain wrong.

      That is why he anchored the regex with $: it has to match to the end of the string regardless of whether a digit is present.
        Shoot, you're right. Monday morning X_X
