Grep logs by start date and end date in different directories

by Anonymous Monk
on Jan 03, 2018 at 06:47 UTC (#1206578, perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I am a beginner to Perl so please understand.

My current code calculates the end date from the start date and the number of days the user enters. I have many directories that contain different log files, such as access logs and system logs. The directories are named by date, e.g.:

Directory name: 2017-12-08
Inside this directory: access.log sys.log

What I want to achieve: when the user keys in an IP address, a start date and a number of days, the script greps the logs for lines containing that IP, from the start date to the end date. For example, if the user keys in 2017-12-08 and 2 days, all logs from 2017-12-08 to 2017-12-10 are grepped and the matches printed. This is my current code:

    use strict;
    use warnings;
    use Time::Piece ();
    use Time::Seconds;

    #Ask for IP address
    print "Enter an IP address to lookup: ";
    my $ipAddress = <STDIN>; # I moved chomp to a new line to make it more readable
    chomp $ipAddress; # Get rid of newline character at the end

    #Ask for number of days
    print "Enter no. of days: ";
    my $numdays = <STDIN>;
    chomp $numdays;

    my $dt = Time::Piece->strptime( $sdate, '%Y-%m-%d');
    $dt += ONE_DAY * $numdays;
    my $edate = $dt->strftime('%Y-%m-%d');

    if ($numdays == 1){
        my $result = `grep -R -E '$ipAddress' $LogDir | grep -E '$sdate'`;
        if ($result){
            print $result;
        }else{
            print "No result found from $sdate. Please try changing your IP address/start date and try again.\n";
        }
    }
    elsif($numdays > 1){
        my $result = `grep -R -E '$ipAddress' $LogDir | sed -n '/$sdate/,/$edate/{/$edate/d; p}'`;
        if ($result){
            print $result;
        }else{
            print "No result found from $sdate. Please try changing your IP address/start date and try again.\n";
        }
    }

How can I grep by date? Currently my code only greps the dates inside the log lines themselves, but that is not a good solution because the date formats in my logs are very inconsistent. I want to select by directory name instead. Any help would be greatly appreciated.

Replies are listed 'Best First'.
Re: Grep logs by start date and end date in different directories
by Discipulus (Monsignor) on Jan 03, 2018 at 09:02 UTC
    Hello and welcome to the wonderful world of Perl!

    As general advice, avoid shelling out if Perl lets you do something in its own way: search afoken's threads about the shell to learn why and how.

    Anyway, if you need to build such a list of directories, the better approach, imho, is to loop over time values going backward. Consider the following little snippet:

        use feature 'say';

        foreach my $day (0..15){
            my @ymd = (localtime(time - 3600*24*$day))[5,4,3]; # build up an array of just year, month and day, n days backward
            say join '-', $ymd[0]+1900, (sprintf '%02d',$ymd[1]+1), (sprintf '%02d',$ymd[2]); # print them in the format you need
        }

        2018-01-03
        2018-01-02
        2018-01-01
        2017-12-31
        2017-12-30
        2017-12-29
        2017-12-28
        2017-12-27
        2017-12-26
        2017-12-25
        2017-12-24
        2017-12-23
        2017-12-22
        2017-12-21
        2017-12-20
        2017-12-19

    With such strings, build up the full paths you need, then use glob to get a list of files, and finally open them to search for the IPs you want.
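
    A minimal sketch of that approach, using the core Time::Piece module that the original question already loads rather than localtime arithmetic. The helper names date_dirs and log_files_for, the *.log pattern, and the choice to count the start date as the first day are assumptions made for illustration, not something settled in this thread.

```perl
use strict;
use warnings;
use Time::Piece ();
use Time::Seconds qw(ONE_DAY);
use File::Glob qw(bsd_glob);

# Hypothetical helper: return $numdays directory names (YYYY-MM-DD),
# starting at $start and counting the start date as the first day.
sub date_dirs {
    my ($start, $numdays) = @_;
    # strptime parses as UTC, so adding whole days never hits a DST jump
    my $t = Time::Piece->strptime($start, '%Y-%m-%d');
    return map { ($t + ONE_DAY * $_)->ymd } 0 .. $numdays - 1;
}

# Hypothetical helper: list the *.log files inside each dated directory
# under $logdir; dates without a directory simply contribute nothing.
sub log_files_for {
    my ($logdir, @dates) = @_;
    return map { sort(bsd_glob("$logdir/$_/*.log")) } @dates;
}

# Prints 2017-12-08, 2017-12-09 and 2017-12-10, one per line.
print "$_\n" for date_dirs('2017-12-08', 3);
```

    bsd_glob is used instead of plain glob so that a log directory containing spaces is not split into several patterns.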

    If you organize this using subroutines all will be clean and wise! You can then add Getopt::Long to accept a --forward option that processes the directories in reverse order, with just @dirs_date = reverse @dirs_date;

    If you avoid the shell you can generalize the program to search not only for IPs but for whatever you want, letting the user enter a custom regex, possibly passed on the command line as an argument.

        # untested...
        foreach my $filepath (@paths){
            open my $fh,'<', $filepath or die "unable to open $filepath!";
            while (<$fh>){
                if($_ =~ /$usr_supplied_regex/){
                    print "$filepath:$.".$_;
                }
            }
        }

    PS: consider signing in to the monastery!


    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: Grep logs by start date and end date in different directories
by karlgoethebier (Monsignor) on Jan 03, 2018 at 10:37 UTC
Re: Grep logs by start date and end date in different directories
by haukex (Abbot) on Jan 03, 2018 at 19:04 UTC
    • The code you posted does not compile as is, which makes it harder for those trying to help. In this case, I'm guessing that you may have edited out too much for posting - you could have instead replaced the user input with simple variable assignments, e.g. my $sdate = '...';. Please see SSCCE.
    • You should in general try to call external programs as little as possible, and I wrote about some of the issues with doing so here. In general, Perl can do anything that sed and awk can, and a grep can be written in Perl as:
          open my $fh, '<', $filename or die "$filename: $!";
          while (<$fh>) {
              if (/pattern/) {
                  # do something ...
                  print $_;
              }
          }
          close $fh;
      See Files and I/O, I/O Operators, perlrequick, and perlretut.
    • The dots in an IP address will be interpreted as a special character in a regular expression: the dot matches anything, so if the user enters "12.34", that'll also match e.g. "12534". To avoid that, use quotemeta or \Q...\E.
        use warnings;
        use strict;
        use DateTime;
        use DateTime::Format::Strptime;
        use Path::Class qw/dir/;

        my $LOGPATH = '.';
        my $STARTD  = '2017-12-08';
        my $NUMDAYS = 3;
        my $PATTERN = '';

        my $strp = DateTime::Format::Strptime->new(on_error=>'croak',
            pattern => '%Y-%m-%d', time_zone=>'local');
        my $dt = $strp->parse_datetime($STARTD);
        my @files;
        for (1..$NUMDAYS) {
            my $date = $dt->strftime('%Y-%m-%d');
            push @files, sort grep { $_->basename=~/\.log\z/i }
                dir($LOGPATH,$date)->children;
            $dt->add(days=>1);
        }

        local @ARGV = @files;
        while (<>) {
            chomp;
            if (/\Q$PATTERN\E/) {
                print "$ARGV:$.: $_\n";
            }
        } continue { close ARGV if eof }
    • It's very good you're using a module like Time::Piece for your date handling. Personally I like DateTime because it does a whole lot more, along with DateTime::Format::Strptime for parsing, which is why I used those above.
    • I'm using Path::Class for getting filenames, where $_->basename=~/\.log\z/i matches those files whose names end in .log (case-insensitively).
    • I used a trick and assigned the list of files to the special @ARGV variable, which normally holds the command line arguments, so that I can make use of Perl's special while (<>) loop, described in I/O Operators. Inside that loop, the current filename is stored in $ARGV, and the line number in $. - but see the documentation on eof as to why I need the snippet of code close ARGV if eof. (If all that is too much magic for now, you can also wrap the grepping code I showed at the top in a for my $filename (@files) { ... } loop.)
    • As opposed to your description, I have taken the "number of days" to include the start day, since that makes more sense to me.
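
    Two of the points above, literal matching with \Q...\E and the plain for-loop alternative to while (<>), can be sketched together using core Perl only. The helper name grep_files is hypothetical, and the digit lookarounds are an extra not discussed in this thread: they are one possible way to stop "12.34" from matching inside "112.345".

```perl
use strict;
use warnings;

# Hypothetical helper: return "file:line: text" for every line in @files
# that contains $needle as a literal string (not as a regex).
sub grep_files {
    my ($needle, @files) = @_;
    # \Q...\E escapes the dots; the digit lookarounds are an addition of
    # this sketch so that "12.34" does not match inside "112.345".
    my $re = qr/(?<!\d)\Q$needle\E(?!\d)/;
    my @hits;
    for my $filename (@files) {
        open my $fh, '<', $filename or die "$filename: $!";
        while (my $line = <$fh>) {
            chomp $line;
            push @hits, "$filename:$.: $line" if $line =~ $re;
        }
        close $fh;   # a fresh handle per file, so $. restarts at 1 each time
    }
    return @hits;
}

# Usage: perl script.pl 12.34 2017-12-08/access.log 2017-12-08/sys.log
if (@ARGV >= 2) {
    my ($ip, @files) = @ARGV;
    print "$_\n" for grep_files($ip, @files);
}
```

    Returning the hits instead of printing them inside the loop keeps the search reusable, for example to print a "no result found" message when the list comes back empty.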
Re: Grep logs by start date and end date in different directories
by thanos1983 (Vicar) on Jan 03, 2018 at 10:50 UTC

    Hello Anonymous Monk,

    Welcome to the Monastery. Fellow Monks have already provided you with answers, but I found your question interesting, so I spent some time writing a small script that, if I understand your description correctly, should do exactly what you want.

    Sample of code:

    I used the modules Date::Manip for the date calculation, File::Find::Rule to traverse the directories and get the files (you could have used the core module File::Find) and finally the debugging module Data::Dumper.

    Data that I used to get the output that I am showing:

    Each directory contains two files same as your description.

    In some of the files I added the IP that you are searching for and also some dummy text (incident error report). Sample of one file below:

    If I understand your description correctly, something like that should do what you need. If not, it should be about 95% of the way there, needing only minor modifications to produce your desired output.

    Hope this helps, BR.

    Seeking for Perl wisdom...on the process of learning...not there...yet!
      Hi, that helped me loads. Thanks a lot. However, is it possible to use ->name('*.bz2')? All my log files are compressed in bz2 format. I have tested it, but it didn't seem to work with *.bz2; it only works with *.log. Any idea why? Once again, thank you so much.
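
      Whatever the cause of the ->name('*.bz2') problem, the contents of a .bz2 file also have to be decompressed before a pattern can match them; a plain open sees only compressed bytes. A minimal sketch using the core IO::Uncompress::Bunzip2 module follows. The helper name grep_bz2 is hypothetical, and how the files are selected is left out because the script being discussed is not shown here.

```perl
use strict;
use warnings;
use IO::Uncompress::Bunzip2 qw($Bunzip2Error);

# Hypothetical helper: return "file:line: text" for each line of a
# bzip2-compressed file that matches the compiled regex $re.
sub grep_bz2 {
    my ($file, $re) = @_;
    # MultiStream lets the reader continue past concatenated bzip2 streams
    my $z = IO::Uncompress::Bunzip2->new($file, MultiStream => 1)
        or die "$file: $Bunzip2Error";
    my $n = 0;
    my @hits;
    while (defined(my $line = $z->getline)) {
        $n++;
        chomp $line;
        push @hits, "$file:$n: $line" if $line =~ $re;
    }
    $z->close;
    return @hits;
}
```

      It would be called with a precompiled pattern, e.g. grep_bz2($path, qr/\Q$ip\E/), once for each compressed log file.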
        Sorry to bother you again, but this is rather urgent for me, so I am working on it on my own; at the same time I hope to get more insights from professionals so that I can do it in a better way. My current script also searches all the log files for IPs in a network range. This is the code that does that:

            use Net::Subnet;
            if (@ARGV){
                while (<>) {
                    my @ips = m/(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/g;
                    next unless @ips;
                    next unless grep { $matcher->($_) } @ips;
                    print $fh $_;
                }
            }

        Do you know how I can implement this in your code? Thanks again.
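
        One way to make that fragment easier to drop into another script is to wrap it in a subroutine that takes a filehandle and a matcher. The sketch below is only an illustration: the fragment does not show how $matcher is built (presumably with Net::Subnet's subnet_matcher), so a hand-rolled IPv4 CIDR check is swapped in here to keep the example core-only. The names ip_to_int, make_matcher and lines_in_range are hypothetical.

```perl
use strict;
use warnings;

# Convert a dotted-quad IPv4 string to a 32-bit integer.
sub ip_to_int {
    my @o = split /\./, shift;
    return ($o[0] << 24) | ($o[1] << 16) | ($o[2] << 8) | $o[3];
}

# Hypothetical stand-in for $matcher: takes CIDR strings and returns a
# closure that is true for any IPv4 address inside one of the ranges.
sub make_matcher {
    my @nets = map {
        my ($base, $bits) = split m{/}, $_;
        $bits = 32 unless defined $bits;          # bare address = single host
        my $mask = $bits ? (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF : 0;
        [ ip_to_int($base) & $mask, $mask ];
    } @_;
    return sub {
        my $ip = ip_to_int(shift);
        return scalar grep { ($ip & $_->[1]) == $_->[0] } @nets;
    };
}

# Same IPv4 pattern as in the fragment above, split up for readability.
my $octet = qr/(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/;
my $ip_re = qr/(?:$octet\.){3}$octet/;

# Return the lines read from $fh that mention at least one IP in range.
sub lines_in_range {
    my ($fh, $matcher) = @_;
    my @hits;
    while (my $line = <$fh>) {
        my @ips = $line =~ /$ip_re/g;
        push @hits, $line if grep { $matcher->($_) } @ips;
    }
    return @hits;
}
```

        Because lines_in_range only needs something it can read lines from and a code reference, it could be called once per log file inside whichever loop collects the files for the chosen dates.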
