PerlMonks  

Regexp for date

by Anonymous Monk
on Mar 25, 2008 at 02:53 UTC

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

Are any monks here familiar with date formats? Is there a universal regexp for matching all numeric formats (YYYY-MM-DD, YYYY/MM/DD, etc.)? (The dates do not need to be valid.)

Replies are listed 'Best First'.
Re: Regexp for date
by nefigah (Monk) on Mar 25, 2008 at 04:08 UTC

    If you just need a quick regex for matching some uncertainly-delimited numeric dates out of a file or something, (and I take it you are not super familiar with regular expressions) check out: http://www.regular-expressions.info/dates.html

    (Btw, that example matches valid dates. I can't really think of why you wouldn't want them to be valid, since someone else is already writing the expression for you :) )

    Note that that site also has some other useful info about regular expressions.

    If you are doing anything beyond some very basic matching, definitely go after the Date:: modules mentioned elsewhere in this thread!


    I'm a peripheral visionary... I can see into the future, but just way off to the side.

Re: Regexp for date
by pc88mxer (Vicar) on Mar 25, 2008 at 05:14 UTC
    For a specific recommendation, I've used the str2time function from the Date::Parse module. It parses a myriad of commonly used formats including "YYYY-MM-DD" and "YYYY/MM/DD".

    When parsing dates of the form mm/dd/yyyy, it might be of interest to you to know that different countries can interpret such a date differently. In the US, for instance, the date 03/04/05 will be interpreted as March 4th, (20)05. In Europe, however, it can be interpreted as April 3rd, (20)05.
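
    To make the ambiguity concrete, here is a minimal sketch in core Perl. The function name interpret_date and the 'US'/'EU' labels are made up for illustration, and the two-digit year is assumed to mean 20yy as in the example above.

```perl
use strict;
use warnings;

# Interpret an ambiguous a/b/yy string under a chosen convention.
# Assumption: the two-digit year means 20yy.
sub interpret_date {
    my ($text, $style) = @_;
    my ($first, $second, $yy) = $text =~ m{\A(\d{1,2})/(\d{1,2})/(\d{2})\z}
        or return;
    # US style puts the month first; the other style puts the day first.
    my ($month, $day) = $style eq 'US' ? ($first, $second) : ($second, $first);
    return sprintf '%04d-%02d-%02d', 2000 + $yy, $month, $day;
}

print interpret_date('03/04/05', 'US'), "\n";   # 2005-03-04
print interpret_date('03/04/05', 'EU'), "\n";   # 2005-04-03
```

    The same input yields two different dates, so the convention has to come from outside the string itself.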

Re: Regexp for date
by ww (Archbishop) on Mar 25, 2008 at 03:16 UTC
    A universal regex would be like a universal solvent.

    Better, for most uses I can conceive at the moment, that you should look at CPAN/PPM (depending on your OS and prefs) for modules in the Date::xxx group.

Re: Regexp for date
by poolpi (Hermit) on Mar 25, 2008 at 11:51 UTC

    See Regexp::Common::time

    # For example: $RE{time}{YMD} # Strictest (equivalent to y4m2d2)

    hth,

    PooLpi

    'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
Re: Regexp for date
by planetscape (Chancellor) on Mar 25, 2008 at 23:47 UTC

      This is a follow-up post:

      I am thinking of a more universal solution that matches anything that looks like a date, so that I can cut a chunk of text wherever a date appears. (That is why I don't need the dates to be valid.)

      The problem I have run into is that too many date formats appear in the documents I need to process:
      2008-03-04
      2008-3-4
      2008-03-4
      2008-3-04
      or
      03-04-2008
      3-4-2008
      and so on. Hyphen, comma, and slash are all possible separators.
      Both US and British styles may also appear.

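      One way to cut text at date-like tokens, sketched under the assumptions stated in the post: the separator is a hyphen, comma, or slash, and the four-digit year comes either first or last. The variable names and the sample text are invented for illustration, and nothing here checks that a match is a valid date.

```perl
use strict;
use warnings;

# Separators named in the post: hyphen, comma, slash.
my $sep = qr{[-,/]};

# Anything date-shaped, year first or year last. No validity check,
# so a string like 2008-99-99 would match as well.
my $date_like = qr{
    (?<!\d)                                # not in the middle of a longer number
    (?: \d{4} $sep \d{1,2} $sep \d{1,2}    # YYYY-M-D
      | \d{1,2} $sep \d{1,2} $sep \d{4}    # M-D-YYYY or D-M-YYYY
    )
    (?!\d)
}x;

my $text = 'intro 2008-03-04 first entry 3/4/2008 second entry 2008,3,04 third entry';

# The capturing group makes split return the dates along with the chunks.
my @pieces = split /($date_like)/, $text;
print "[$_]\n" for @pieces;
```

      Because US and British orderings look identical to a regex, this only finds where the dates are; deciding which field is the month is a separate problem.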
Re: Regexp for date
by swampyankee (Parson) on Mar 25, 2008 at 16:29 UTC

    With the particular format you're asking about, a regex shouldn't be that difficult, especially when you're not concerned with validity: something like this:

    m!^[0-9]{4}[/\-.][0-9]{1,2}[/\-.][0-9]{1,2}$!
    which, if my regex memory works, will match a string made up of three groups of digits separated by slashes, hyphens, or periods, where the first group has 4 digits, and the second and third groups have 1 or 2 digits. I would do some minimal validation (to throw out clearly invalid values for month or day, such as 93), so it's more likely I'd use a regex like this one:
    m/^\d{4}[\/\-.](1[0-2]|0?[1-9])[\/\-.]([012]?[0-9]|3[01])$/
    Which, if my regex juju is still working, will match a string comprising three groups, separated by slashes, hyphens, or periods, where the first group has 4 digits, the second a number in the range 1 through 12, and the third a number in the range 0 (oops!) through 31. I know there is no day of the month numbered 0, but I'll leave that as an exercise for the reader.
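
    One possible answer to that exercise, as a sketch rather than the author's own solution: spell out the day as 1 through 31 so that 0 and 00 are rejected. The variable name $ymd and the test strings are made up for illustration.

```perl
use strict;
use warnings;

# Year: 4 digits; month: 1-12; day: 1-31 (leading zero optional on both).
my $ymd = qr{\A\d{4}[/\-.](1[0-2]|0?[1-9])[/\-.](3[01]|[12][0-9]|0?[1-9])\z};

for my $candidate (qw(2008-03-25 2008/3/4 2008.12.31 2008-13-01 2008-03-00 2008-03-32)) {
    printf "%-12s %s\n", $candidate, $candidate =~ $ymd ? 'match' : 'no match';
}
```

    The first three strings match and the last three do not. This is still only a range check: it knows nothing about month lengths, so 2008-04-31 passes.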

    Another way to do this is to use split, breaking the string on the required separator, something like this:

    ($year, $month, $day) = split(/[.\-\/]/, $date, 3);

    emc

    Information about American English usage here and here. Floating point issues? Please read this before posting.

Re: Regexp for date
by Your Mother (Archbishop) on Mar 25, 2008 at 23:05 UTC

    Assuming you mean it when you say "no need to be valid," swampyankee's ($year, $month, $day) = split(/[.\-\/]/, $date, 3) is nice enough and can be even simpler: ($year, $month, $day) = split(/\D+/, $date, 3). For MySQL I've used =~ /\A(\d{4})\D*(\d{2})\D*(\d{2})/ which works on 2008-03-25 as well as 20080325 (a format used in certain date fields in older versions, IIRC).
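
    A short sketch of both ideas, using only core Perl; the sample strings are the ones mentioned above.

```perl
use strict;
use warnings;

# Split on any run of non-digits, so the separator no longer matters.
my ($year, $month, $day) = split /\D+/, '2008/03/25', 3;
print "$year $month $day\n";

# \D* allows zero separators, so the compact form parses too.
for my $stamp ('2008-03-25', '20080325') {
    if (my ($y, $m, $d) = $stamp =~ /\A(\d{4})\D*(\d{2})\D*(\d{2})/) {
        print "$stamp => $y $m $d\n";
    }
}
```

    Note that the second pattern insists on two-digit month and day fields, so it will not match 2008-3-4.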

    For parsing arbitrary date text (in English anyway) there is Date::Manip (ParseDateString and ParseDate).

Re: Regexp for date
by DrHyde (Prior) on Mar 31, 2008 at 09:12 UTC
    If you don't need to only match valid dates, then try this:
    /./
