You should test your regexp on ",foo,", which will work and shouldn't, as will "!!foo!!,".
It depends on what you think "should work". The OP's original regex was not anchored, and seemed intended to extract matching substrings rather than confirm that an entire string matched.
The /(\w+(?:\,\w+)*)/ regex will successfully extract the matching "foo" substring from your two sample cases into $1. If you want to check the entire string, then yes, leave out the parentheses and use ^...$ anchors.
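To make the extract-versus-validate distinction concrete, here is a minimal sketch. The helper names extract_list and is_valid_list are made up for illustration; only the two patterns come from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: unanchored pattern, returns the first run of
# comma-separated words found anywhere in the string (or undef).
sub extract_list {
    my ($str) = @_;
    return $str =~ /(\w+(?:,\w+)*)/ ? $1 : undef;
}

# Hypothetical helper: anchored pattern, true only if the whole string
# is a list of comma-separated words.
sub is_valid_list {
    my ($str) = @_;
    return $str =~ /^\w+(?:,\w+)*$/ ? 1 : 0;
}

for my $s (',foo,', '!!foo!!,', 'foo,bar') {
    my $found = extract_list($s);
    printf "%-10s extract=%-8s valid=%d\n",
        $s, defined $found ? $found : '(none)', is_valid_list($s);
}
```

For the two sample strings this should report "foo" as the extracted substring while the anchored check rejects both; "foo,bar" passes both. Note that $ also matches just before a trailing newline, so use \z instead if a string like "foo\n" must be rejected.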
Update: with regard to "It's probably faster to use 2 regexps too": yes, a quick benchmark shows that, with anchoring, the double-regex style runs about 50% faster than the single-regex solution I posted. (Perhaps one of the resident RegEx gurus can explain why this is?)
However, if you want to extract matching substrings, I think the single regex is a sensible approach.
Update: with regard to "It's probably faster to use 2 regexps too": yes, a quick benchmark shows that, with anchoring, the double-regex style runs about 50% faster than the single-regex solution I posted. (Perhaps one of the resident RegEx gurus can explain why this is?)
I'd be interested to see your benchmark (code + data), as I
don't come to the same conclusion. The benchmark below shows
the one regex solution to be somewhat faster - the data sample
is tiny though.
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw /timethese cmpthese/;
chomp (our @lines = <DATA>);
our (@r1, @r2);
cmpthese -10 => {
    one => '@r1 = map {/^\w+(?:,\w+)*$/ ? 1 : 0} @lines',
    two => '@r2 = map {/^[\w,]+$/ && !/^,|,,|,$/ ? 1 : 0} @lines',
};
die "Unequal" unless "@r1" eq "@r2";
__DATA__
one,two,three,four,five
,one,two,three,four,five
one,two,three,four,five,
one,two,three,,four,five
one,two,three four,five
       Rate  two  one
two 25436/s   -- -26%
one 34417/s  35%   --
Abigail
use strict;
use Benchmark 'cmpthese';
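# Lines are read without chomp, so each one keeps its trailing newline.
my @data = <DATA>;
# Repeat each line 100 times to get longer test strings
# (the join is a no-op here, since $_ x 100 is already a single string).
my @long = map { join '', $_ x 100 } @data;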
my %cases = (
'Single' => sub { for ( @long ) { /^\w+(?:,\w+)*$/ } },
'Double' => sub { for ( @long ) { /^[\w,]+$/ && ! /^,|,,|,$/ } },
);
cmpthese( 0, \%cases);
__DATA__
!@#$as3dfa
,sdfas3df,
asd3fsa,,a3sdf
as3df,asdf3,3asdf,asd3f
sad3fasdjasdfkasdfklas3jf
3sad3fasdjasdfkasdfklas3jf
3sad3fasdjasdfkasdfklas3jf3
          Rate Single Double
Single  6158/s     --   -83%
Double 35319/s   474%     --