Honourable Monks,
This challenge came from two colleagues who know I am trying to learn the power of regular expressions by applying them anywhere I can. They had to validate a date in a string with a regular expression only (and a single one at that), due to environmental/tool constraints. They would have been happy if it did not check for long months, let alone leap days, but any sanity check would be welcome. You can't do calculations with regexes, or can you?
Maybe this has been done before, or is even folklore, but to me it was completely new. When I remembered that one can lexically establish divisibility by 4, 100 and 400 (for 4 and 100 only the last two digits matter, and for 400 the year must end in 00 with its first two digits divisible by 4), I started expanding the regex. Now it works. Unfortunately the 'tool' my colleagues have to use (TeamSite) does not support the x-modifier (or they could not find out how), so I had to make one very long line-noise-like string. That works too, so they are happy as well.
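The lexical divisibility idea can be cross-checked against plain arithmetic. The following is only a minimal sketch: the names `$leap_re` and `is_leap` are made up for this illustration, and the pattern covers just the year part, assuming a zero-padded four-digit year string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Lexical leap-year test on a 4-digit year string, no arithmetic involved.
# ($leap_re is a hypothetical name used only for this illustration.)
my $leap_re = qr/^(?:
      \d\d (?: 0[48] | [2468][048] | [13579][26] )   # last two digits divisible by 4, but not 00
    | (?: [02468][048] | [13579][26] ) 00            # first two digits divisible by 4, then 00
)$/x;

# Arithmetic reference: the usual Gregorian rule.
sub is_leap {
    my ($y) = @_;
    return ($y % 4 == 0 && $y % 100 != 0) || $y % 400 == 0;
}

# Compare both tests for every year 0000 .. 9999.
my $mismatches = 0;
for my $y (0 .. 9999) {
    my $year    = sprintf '%04d', $y;
    my $lexical = $year =~ $leap_re ? 1 : 0;
    my $arith   = is_leap($y)       ? 1 : 0;
    $mismatches++ if $lexical != $arith;
}
print "mismatches: $mismatches\n";
```

If the two tests agree on every year, the script reports zero mismatches, which is what the two alternatives are meant to guarantee.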
Please have a look at this code and tell me if I am on the right track. I was quite happy with the result, and this is my first message here, so please don't judge me too harshly.
#!/usr/bin/perl -w
#============validdate.pl===================================
# Checks date dd/mm/yyyy
# 4/3/2002 danny@vrijdag.xs4all.nl
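use strict;

print "Check dates of the form dd/mm/yyyy\nq quits.\n-:";
test();

# just a wrapper: read lines, validate each one, stop on a quit word
sub test {
    while (my $line = <>) {
        chomp $line;    # chomp strips only the newline; chop would eat any last character
        last if $line =~ m/^(?:q|bye|end|exit|quit|stop)$/i;
        my $val = validate($line);
        print " '$line': $val\n-:";
    }
    print "\n";
}

# The core:
sub validate {
    my ($in) = @_;
    # match against the argument, not against whatever happens to be in $_
    return "No dd/mm/yyyy" unless $in =~ m!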
        ^(?:                              # the start
            (?:                           # A: dates that do not depend on leap years
                (?:0?[1-9]|1\d|2[0-8])    # A1: dd (0)1 - 28
                /                         # separator (or [\.- ])
                (?:0?[1-9]|1[0-2])        # mm (0)1 - 12 (any month)
            |                             # or A2:
                (?:29|30)                 # dd 29 or 30
                /                         # separator
                (?:0?[13-9]|1[0-2])       # mm (0)1, 3 - 12: any month but 2
            |                             # or A3:
                31                        # dd 31
                /                         # separator
                (?:0?[13578]|1[02])       # a long mm: 1,3,5,7,8,10,12
            )                             # end dd/mm non-leap
            /                             # separator
            \d{4}                         # 4 digits: any yyyy goes in A.
        |                                 # B: leap day in a leap year:
            29/0?2/                       # that is 29/(0)2, February
            (?:                           # B1:
                \d\d                      # any century and
                (?:                       # a two-digit year divisible
                    0[48]                 # by 4 but not by 100
                |                         # (so 04, 08 but not 00)
                    [2468][048]           # even tens: 20, 24, 28, 40..
                |
                    [13579][26]           # odd tens: 12, 16, 32, 36, 52..
                )
            |                             # B2:
                (?:                       # century divisible by 4, so yyyy divisible by 400
                    [02468][048]          # even first digit: 00xx, 04xx, 08xx, 20xx..
                |
                    [13579][26]           # odd first digit: 12xx, 16xx, 32xx..
                )00                       # year ends in 00
            )                             # end leap day
        )                                 # end date
    $!x;                                  # nothing else, nothing more
    return "Ok";
}
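For a tool without the x-modifier, the single-line version can be derived mechanically from the commented pattern instead of being typed by hand. The following is a rough sketch under stated assumptions: the pattern in `$verbose` is a shortened stand-in (not the full date regex), the names `$verbose` and `$flat` are made up for illustration, and the stripping is naive, so it only works when the pattern contains no literal `#` and no significant whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A shortened stand-in for a commented /x pattern (hypothetical example).
my $verbose = <<'END';
^(?:                      # start
   (?:0?[1-9]|[12]\d)     # dd
   /                      # separator
   (?:0?[1-9]|1[0-2])     # mm
)$                        # end
END

my $flat = $verbose;
$flat =~ s/#.*//g;     # drop comments (naive: assumes no literal '#' in the pattern)
$flat =~ s/\s+//g;     # drop all whitespace (naive: assumes none is significant)

print "$flat\n";       # the one-line pattern to paste into the tool
```

The flattened string can then be tested in Perl without /x, for example with `'4/3' =~ qr{$flat}`, before handing it over.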
Any comment that improves it is appreciated: structure, workings, style, readability; maybe there is some nice "any digit but 2" syntax I am unaware of. You name it. I know nobody who is into regular expressions.
TIA,
Danny