Regexp problem

agaved has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I want to check whether a date is in the form '26-04' or '26-04-12' (both will do).

The divider can also be '.' or '_'.

I used as regexp /\d{2}\-._\d{2}(\-._\d{2})*/ but it doesn't match ... I cannot understand why.

I guess it will be pretty obvious in hindsight, but I have been banging my head on the wall for hours and couldn't get through.

Any help much appreciated.

Re: Regexp problem by choroba (Cardinal) on Aug 28, 2012 at 13:29 UTC
Please enclose the regex in `<code> ... </code>` tags so that it is readable. Also, please specify which strings the regex fails to match but should; it works for me: `perl -E ' $R = qr/\d{2}[-._]\d{2}([-._]\d{2})*/; say "$_ ", /$R/ ? "Y" : "N" for qw/1-2 1-20 11.20-1 11.20 12_30.99 000_00_000/'` Maybe you are missing the anchors? Put `^` at the beginning and `$` at the end of the regex. Also, using `?` instead of `*` might be desirable so as not to match strings like `12-12-12-12-12-12-12`. لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
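To illustrate the two suggestions above (anchors, and `?` in place of `*`), here is a minimal sketch. The helper name `looks_like_date` is hypothetical and not from the thread, and the non-capturing group `(?:...)` is an editorial choice; it only checks the shape of the string, not whether the day and month are valid.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns 1 if $s has the shape
# DD-MM or DD-MM-YY, with '-', '.' or '_' as the divider, and 0 otherwise.
sub looks_like_date {
    my ($s) = @_;
    # ^ and $ anchor the match; '?' allows the third group at most once.
    return $s =~ /^\d{2}[-._]\d{2}(?:[-._]\d{2})?$/ ? 1 : 0;
}

for my $s (qw(26-04 26-04-12 26.04_12 12-12-12-12 126-04 26-4)) {
    printf "%-12s %s\n", $s, looks_like_date($s) ? 'Y' : 'N';
}
```

With the anchors in place, strings with extra digits or extra groups are rejected, whereas the unanchored pattern would find a matching substring inside them.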
Re: Regexp problem by philiprbrenan (Monk) on Aug 28, 2012 at 13:29 UTC
Please use a character class `[]` to list the separator choices. Please consider using \A and \Z to test the entire input string. I think you meant ? rather than * for the final optional section? '?' means optional (zero or one), while * means zero or more. `use feature ":5.14"; use warnings FATAL => qw(all); use strict; use Data::Dump qw(dump); my @d = qw(26-04 26-04-12 26.04 26.04.12 30/8 30/08 30.08/12 1 12 aa aa/bb aa.c help); say (/\A\d{2}[.-]\d{2}([.-]\d{2})?\Z/ ? "Matches for $_" : "FAILS for $_" ) for @d;` Produces:
Matches for 26-04
Matches for 26-04-12
Matches for 26.04
Matches for 26.04.12
FAILS for 30/8
FAILS for 30/08
FAILS for 30.08/12
FAILS for 1
FAILS for 12
FAILS for aa
FAILS for aa/bb
FAILS for aa.c
FAILS for help
Re^2: Regexp problem by kcott (Archbishop) on Aug 28, 2012 at 16:24 UTC
"Please consider using \A and \Z to test the entire input string." I agree with this in principle; however, there's a subtle difference between `\Z` (uppercase) and `\z` (lowercase). `/\A ... \z/` matches the entire input string. `/\A ... \Z/` matches the entire input string except for a terminal newline, if one exists. Here are a couple of one-liners to demonstrate this: `$ perl -E 'my $x = qq{qwerty\n}; $re = qr{\Aqwerty\Z}; say +($x =~ /$re/) ? 1 : 0;'` prints `1`, while `$ perl -E 'my $x = qq{qwerty\n}; $re = qr{\Aqwerty\z}; say +($x =~ /$re/) ? 1 : 0;'` prints `0`. See Assertions under perlre - Regular Expressions which has: "To match the actual end of the string and not ignore an optional trailing newline, use `\z`." -- Ken
Re: Regexp problem by jethro (Monsignor) on Aug 28, 2012 at 13:35 UTC
`perl -e 'if ("26-04"=~/\d{2}[\-._]\d{2}([\-._]\d{2})*/) { print "yes\n" }' #prints yes` Seems to work. There is no need to escape the '-', as it is not treated as a range character when it is the first character in a character class. PS: Use <c> tags around your regex so that it gets displayed correctly in the browser.
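The point about hyphen placement can be checked with a small sketch. The variable names `$literal` and `$range` are illustrative only. The second class deliberately puts the hyphen between two characters to show what happens when it is read as a range.

```perl
use strict;
use warnings;

# Hyphen first in the class: it is a literal '-'.
my $literal = qr/^[-._]$/;

# Hyphen between two characters: it forms a range, here from '.' (0x2E)
# through '_' (0x5F), which covers digits and uppercase letters but not
# '-' itself (0x2D).
my $range = qr/^[.-_]$/;

for my $ch ('-', '.', '_', 'A', '5') {
    printf "%s  literal:%s  range:%s\n", $ch,
        ($ch =~ $literal ? 'Y' : 'N'),
        ($ch =~ $range   ? 'Y' : 'N');
}
```

Placing the hyphen first or last in the class, or escaping it as in the one-liner above, avoids the accidental range.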
Re: Regexp problem by agaved (Novice) on Aug 28, 2012 at 20:15 UTC
Thanks for the quick answers. I was watching this in the debugger and I think I misinterpreted it. So now my question is: why, in the debugger, do I get `x "24-06" =~ /\d{2}-\d{2}/` as 1, but `x "24-06" =~ /\d{2}-\d{2}(p)*/` as undef?
Re^2: Regexp problem by linuxer (Curate) on Aug 28, 2012 at 20:35 UTC
As far as I understand the debugger, both expressions are evaluated in list context. The first expression simply returns 1 for success and ~~I assume 0~~ an empty list for failure; so it's a single boolean result in list context. The second expression has capturing parentheses, which (in list context) produce a list of captured results. As your `(p)` could not be matched, there is no captured value for `(p)`, so it returns a list with one element: `undef`. That's my attempt to explain it. I am sure there are others who can explain it in more detail, more accurately, and even show some insight into the internals ... `DB<4> x "24-06" =~ /\d+-\d+/` gives `0 1`, and `DB<5> x "24-06" =~ /\d+-\d+failure/` gives `empty array`. Edit: fixed assumption upon failed regex match, added code example, some rephrasing.
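The same behaviour can be reproduced outside the debugger. The following is a minimal sketch using ordinary scalar and list assignments; the variable names are illustrative and not from the thread.

```perl
use strict;
use warnings;

my $s = "24-06";

# Scalar context: the match simply reports success or failure.
my $ok = $s =~ /\d{2}-\d{2}(p)*/;

# List context without capture groups: (1) on success, empty list on failure.
my @plain  = $s =~ /\d{2}-\d{2}/;
my @failed = $s =~ /\d{2}-\d{2}failure/;

# List context with a capture group: the list of captures is returned.
# (p)* matched zero times, so the only capture is undef.
my @captured = $s =~ /\d{2}-\d{2}(p)*/;

printf "scalar: %s\n", $ok ? 'matched' : 'no match';
printf "plain: %d element(s), failed: %d, captured: %d, first defined: %s\n",
    scalar(@plain), scalar(@failed), scalar(@captured),
    (defined($captured[0]) ? 'yes' : 'no');
```

So the `undef` shown by `x` does not mean the match failed; testing the match in scalar (boolean) context, as in `if ($s =~ /.../)`, gives the success flag regardless of capture groups.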
