Re: regex-matching the date
by Trimbach (Curate) on May 12, 2001 at 16:44 UTC
There are a couple of ways to do this, using either a regex or (if your dates are always well-formed) 'split'. If you use a regex you need to do two things. 1) Add a ^ character to anchor your regex to the beginning of your scalar. As it is, your regex will match ANYWHERE in the string, not just at the beginning; with ^ the match is restricted to the start of the string, which is what you want. 2) Add capturing parentheses around the parts of the regex you're interested in. In scalar context, a match without capturing parentheses only tells you "true" or "false" depending on whether a match was found. It does NOT hand you the matched text itself. (Well, it does, but only if you use some funny variables... not recommended.) Like this:
#!/usr/bin/perl -w
use strict;
my $date= 'Sun Apr 1 10:27:03 CDT 2001';
if ($date =~ m/^(\w{3}\s+\w{3}\s+\d+)/) {
    print "Matched $1\n";
}
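To see what the ^ anchor changes, here is a minimal sketch. The log line in $line is made up for illustration (it is not from the original question); it puts the date in the middle of the string so the two patterns behave differently.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical line with the date in the middle, not at the start.
my $line = 'backup finished Sun Apr 1 10:27:03 CDT 2001';

# Unanchored: the pattern may match anywhere in the string.
my ($anywhere) = $line =~ m/(\w{3}\s+\w{3}\s+\d+)/;

# Anchored with ^: the pattern must match at the very start.
my ($at_start) = $line =~ m/^(\w{3}\s+\w{3}\s+\d+)/;

print "unanchored: ", (defined $anywhere ? $anywhere : '(no match)'), "\n";
print "anchored:   ", (defined $at_start ? $at_start : '(no match)'), "\n";
```

In list context the match returns the captured groups, so a failed match leaves the variable undefined, which is why the sketch checks with defined before printing.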
Alternatively, you can use 'split' (this would be my choice). split breaks the string up into whitespace-separated chunks, so as long as the order of the chunks doesn't change it's all good:
my $date= 'Sun Apr 1 10:27:03 CDT 2001';
my ($weekday, $month, $day) = split " ", $date;
print "$weekday $month $day\n";
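One detail worth knowing is that the " " pattern is a special case for split. The sketch below compares it with a literal single-space regex; the padded input string is an assumption about how a real date line might look, not something from the original post.

```perl
use strict;
use warnings;

# Assumed input: a leading space and a padded day ("Apr  1").
my $date = ' Sun Apr  1 10:27:03 CDT 2001';

# split " " skips leading whitespace and treats any run of
# whitespace as a single separator.
my @awk_style = split " ", $date;

# split / / splits on every single space, so empty fields show up.
my @single = split / /, $date;

print scalar(@awk_style), " fields: @awk_style\n";
print scalar(@single), " fields with / /\n";
```

With padded or untidy input, the " " form is usually the one you want for this kind of job.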
perlfunc has more details on split (perldoc -f split) if you're interested. Enjoy!
Gary Blackburn
Trained Killer
Re: regex-matching the date
by Albannach (Monsignor) on May 12, 2001 at 17:43 UTC
Correct me if I'm wrong: in order to use the split function, I need
to use the regex to find where the date is in the file, then place
it into a variable, then split it? Even so, I think that is something
I will use for formatting purposes. I finally got it working. I
found I was making a newbie mistake: I was testing for the match, but I wasn't
assigning it to a variable. My question now is about how I assigned it to a variable.
if (/(\w{3}\s+\w{3}\s+\d+)/) {
    $foo = $&;
}
How does this differ from using
$foo = $1;
If I am running through a long file, and $foo is changing
frequently, will one work better than the other?
By the way, I like the last solution you used. The date will
always be in the same format, and I am pretty sure that there
will never be anything else with the same pattern, but then again
never say never.
Thanks for all the help... I'm currently struggling through other
regex problems, but so far I have worked most of them out on my
own, which I prefer to do before asking the monks.
Stuffy
WARNING: Once Perl sees that you need one of $&, $`, or $'
anywhere in the program, it has to provide them for every
pattern match. This may substantially slow your program.
In other words, you can use $& (I used to; I'm an ex-sed hacker), but it will slow things down and it's kind of unmaintainable. $1, $2, $3, et cetera are really shinier, happier codelets.
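As a rough sketch of what each variable actually holds: $& is the whole matched text, while $1, $2 and so on hold only what their own parentheses captured. The pattern below is regrouped for illustration and is not the one from the question. When the parentheses wrap the entire pattern, as in the question, $1 and $& end up with the same text. As far as I know, the slowdown in the warning applies to older perls; the perlvar documentation says 5.20 and later no longer pay it.

```perl
use strict;
use warnings;

my $line = 'Sun Apr 1 10:27:03 CDT 2001';
my ($whole, $first, $rest);

if ($line =~ /(\w{3})\s+(\w{3}\s+\d+)/) {
    $whole = $&;   # everything the pattern matched
    $first = $1;   # only the first parenthesised group
    $rest  = $2;   # only the second group
    print "whole=[$whole] first=[$first] rest=[$rest]\n";
}
```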
brother dep.
--
Laziness, Impatience, Hubris, and Generosity.
Brother dep has covered the downside of $&, but on your
split question, the beauty there is that you don't need to match
the (sometimes) complex target of your interest, just the separators that mark where
your interest ends, and that's often a lot easier. In this case, if you're going
to verify the date anyway, there is not much sense in going to great lengths to
do that in the regex, so you can just split on whitespace instead.
As you noted, split
by itself won't find your dates for you. It is a great option if you
are parsing some sort of log file in which the lines always start with that date
format, but if you want to get that date out of the middle of a lot of other text,
a specific regex would be my choice, and instead of split you can use $1 etc. to
get your date components, like:
if (/(\w{3})\s+(\w{3})\s+(\d+)/) {
    ($day, $month, $daynum) = ($1, $2, $3);
}
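The two approaches can also be combined, which is roughly what was asked about: let a regex locate the date, then split what it captured. This is only a sketch; the line in $line and the extra time-of-day check in the pattern are assumptions added to keep the match from landing on unrelated words.

```perl
use strict;
use warnings;

# Hypothetical line with the date buried in other text.
my $line = 'job 42 started Sun Apr 1 10:27:03 CDT 2001 on host alpha';

my ($day, $month, $daynum);
if ($line =~ /(\w{3}\s+\w{3}\s+\d+\s+\d\d:\d\d:\d\d)/) {
    # The regex locates the date; split then breaks the captured text
    # apart. Only the first three fields are kept by the list assignment.
    ($day, $month, $daynum) = split " ", $1;
    print "$day $month $daynum\n";
}
```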
Finally, while we're talking about OWTDI, you might also consider unpack
for jobs like this as it is usually faster, though it is even more fussy about
the format of the data being consistent. It is however ideal for fixed-width columns of data
(anyone else still dealing with data in card images?).
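A minimal unpack sketch, assuming the date really is fixed-width with the day padded to two columns ("Apr  1"). That layout is an assumption; the sample in this thread has a single space there, and unpack would then pick up the wrong columns.

```perl
use strict;
use warnings;

# Assumed fixed-width layout: 3 chars, space, 3 chars, space, 2 chars.
my $date = 'Sun Apr  1 10:27:03 CDT 2001';

# A3 = three characters, x = skip one byte, A2 = two characters.
my ($day, $month, $daynum) = unpack 'A3 x A3 x A2', $date;

$daynum += 0;   # numify, which drops the leading pad space

print "$day $month $daynum\n";
```

The A format strips trailing spaces but not leading ones, hence the numify step for the day.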
--
I'd like to be able to assign to an luser
Re: regex-matching the date
by Chady (Priest) on May 12, 2001 at 11:46 UTC
Looks OK to me... maybe you need to anchor it, as in /^\w{3}\s\w{3}\s\d/, or check out time if you are doing this based on the time the script is running.
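If the date is just the current time, a sketch along these lines might do, using the built-in localtime in scalar context. Note the \s+ rather than a single \s: localtime pads single-digit days with an extra space, and its string has no time zone field, so it is not identical to the sample date in the question.

```perl
use strict;
use warnings;

# In scalar context localtime returns a string such as
# "Sun Apr  1 10:27:03 2001".
my $now = localtime;

my ($day, $month, $daynum);
if ($now =~ /^(\w{3})\s+(\w{3})\s+(\d+)/) {
    ($day, $month, $daynum) = ($1, $2, $3);
    print "$day $month $daynum\n";
}
```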
He who asks will be a fool for five minutes, but he who doesn't ask will remain a fool for life.
Chady | http://chady.net/
Re: regex-matching the date
by Eureka_sg (Monk) on May 12, 2001 at 11:41 UTC
Note that the '+' quantifier is greedy, so your regex will match the entire
string. You can use '?' to make it non-greedy:
/(\w{3}\s+?\w{3}\s+?\d+)/
and the result will be stored in $1.
UPDATE: Ignore this post.
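For anyone wondering why the update applies, a quick sketch: on the sample date, the greedy and non-greedy versions capture the same text, because each \s+ is followed by something that cannot match whitespace, so there is nothing extra for the greedy version to swallow.

```perl
use strict;
use warnings;

my $date = 'Sun Apr 1 10:27:03 CDT 2001';

my ($greedy) = $date =~ /(\w{3}\s+\w{3}\s+\d+)/;
my ($lazy)   = $date =~ /(\w{3}\s+?\w{3}\s+?\d+)/;

print "greedy: [$greedy]\n";
print "lazy:   [$lazy]\n";
print "same result\n" if $greedy eq $lazy;
```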
Re: regex-matching the date
by stuffy (Monk) on May 12, 2001 at 11:28 UTC
Had a typo; my regex should be
/\w{3}\s+\w{3}\s+\d+/
Stuffy
Hrmmm ... it seems to work for me. I'm not entirely sure I know what context you are trying to use it in ... Try this ...
$var = "Sun Apr 1 10:27:03 CDT";
if ($var =~ /(\w{3}\s+\w{3}\s+\d+)/) {
    print $1."\n";
}
This worked fine for me in my testing - Note the additional brackets around the regex to allow the result to be pulled from $1. If you are still having problems, post again with a bit more context as to where you are using this regex and someone with more experience than myself may be able to help further.