
Re: Grep logs by start date and end date in different directories

by thanos1983 (Vicar)
on Jan 03, 2018 at 10:50 UTC (#1206593=note)


in reply to Grep logs by start date and end date in different directories

Hello Anonymous Monk,

Welcome to the Monastery. Fellow Monks have provided you with answers, but I found your question interesting, so I spent some time writing a small script that, if I understand your description correctly, should do exactly what you want.

Sample of code:

#!/usr/bin/perl
use strict;
use warnings;
use Date::Manip;
use Data::Dumper;
use File::Find::Rule;

# Return every access.log and sys.log found under the given directories.
sub get_files {
    my (@dirs) = @_;
    my $level = 2;    # how many levels to descend below each directory

    my @files = File::Find::Rule->file()
                                ->name('access.log', 'sys.log')
                                ->maxdepth($level)
                                ->in(@dirs);
    return @files;
}

# Print "file:line:text" for every line that contains the given IP.
sub searchForIP {
    my ($files, $ip) = @_;
    return unless @$files;    # with an empty @ARGV, <> would read STDIN

    local @ARGV = @$files;
    while (<>) {
        # \Q...\E stops the dots in the IP from matching any character
        print "$ARGV:$.:$_" if /\Q$ip\E/;
    } continue {
        close ARGV if eof;    # reset $. at the end of each file
    }
    return;
}

my $numberOfDays = '2 days';
my $dateStart    = ParseDate("today");
my $dateEnd      = DateCalc($dateStart, $numberOfDays);

# Every day from $dateStart to $dateEnd
my @dates = ParseRecur("0:0:0:1:0:0:0", "", $dateStart, $dateEnd);
my @datesFormatted = map { UnixDate($_, '%Y-%m-%d') } @dates;
# print Dumper \@datesFormatted;

my @files = get_files(@datesFormatted);
# print Dumper \@files;

my $ip = "127.0.0.1";
searchForIP(\@files, $ip);

__END__

$ perl test.pl
2018-01-03/access.log:1:127.0.0.1 This is insident 1 in 2018-01-03
2018-01-03/access.log:4:127.0.0.1 This is second insident 4 in 2018-01-03
2018-01-05/sys.log:1:127.0.0.1 This is insident 1 in 2018-01-05
2018-01-05/sys.log:4:127.0.0.1 This is second insident 4 in 2018-01-05

I used the modules Date::Manip for the date calculation, File::Find::Rule to traverse the directories and collect the files (the core module File::Find would also work), and finally the debugging module Data::Dumper.
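For anyone who prefers to stay with core modules, here is a minimal sketch of the same file collection done with File::Find. The name get_files_core and its array-reference calling convention are my own choices for illustration, not part of the script above, and the depth counting assumes "/" as the path separator.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Collect access.log and sys.log under each directory in @$dirs, descending
# at most $max_depth levels (1 = files directly inside the directory).
# get_files_core is a hypothetical name used only for this sketch.
sub get_files_core {
    my ($dirs, $max_depth) = @_;
    $max_depth //= 2;

    my @files;
    for my $dir (@$dirs) {
        (my $top = $dir) =~ s{/+\z}{};    # drop trailing slashes
        next unless -d $top;              # skip dates that have no directory

        find(
            {
                no_chdir => 1,    # keep $File::Find::name usable as a path
                wanted   => sub {
                    my $path = $File::Find::name;
                    # Depth below $top = number of "/" after the $top prefix.
                    my $depth = () = substr($path, length $top) =~ m{/}g;

                    if (-d $path) {
                        # Do not descend below the requested depth.
                        $File::Find::prune = 1 if $depth >= $max_depth;
                        return;
                    }
                    push @files, $path
                        if -f $path && $path =~ m{/(?:access|sys)\.log\z};
                },
            },
            $top,
        );
    }
    @files = sort @files;
    return @files;
}

# Usage: perl find_core.pl 2018-01-03 2018-01-04 2018-01-05
print "$_\n" for get_files_core([@ARGV]);
```

Unlike File::Find::Rule, File::Find has no built-in depth limit, so the sketch prunes directories by hand once the requested depth is reached.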

Data that I used to get the output that I am showing:

$ ls -la
total 40
drwxr-xr-x 8 tinyos tinyos 4096 Jan 3 11:37 .
drwxr-xr-x 5 tinyos tinyos 4096 Jan 2 20:38 ..
drwxr-xr-x 2 tinyos tinyos 4096 Jan 3 10:01 2018-01-01
drwxr-xr-x 2 tinyos tinyos 4096 Jan 3 10:02 2018-01-02
drwxr-xr-x 2 tinyos tinyos 4096 Jan 3 11:33 2018-01-03
drwxr-xr-x 2 tinyos tinyos 4096 Jan 3 10:02 2018-01-04
drwxr-xr-x 2 tinyos tinyos 4096 Jan 3 11:34 2018-01-05
drwxr-xr-x 2 tinyos tinyos 4096 Jan 3 11:27 2018-01-06
-rw-r--r-- 1 tinyos tinyos 1230 Jan 3 11:37 test.pl
-rw-r--r-- 1 tinyos tinyos  414 Jan 3 10:26 test.pl~

Each directory contains two files, as in your description.

$ ls -la 2018-01-01/
total 8
drwxr-xr-x 2 tinyos tinyos 4096 Jan 3 10:01 .
drwxr-xr-x 8 tinyos tinyos 4096 Jan 3 11:37 ..
-rw-r--r-- 1 tinyos tinyos    0 Jan 3 10:01 access.log
-rw-r--r-- 1 tinyos tinyos    0 Jan 3 10:01 sys.log

In some of the files I added the IP that you are searching for, plus some dummy text (an incident error report). A sample of one file is below:

$ cat 2018-01-03/access.log
127.0.0.1 This is insident 1 in 2018-01-03
127.0.0.2 This is insident 2 in 2018-01-03
127.0.0.3 This is insident 3 in 2018-01-03
127.0.0.1 This is second insident 4 in 2018-01-03

If I understand your description correctly, something like this should do what you need. If not, it should be about 95% of the way there and need only minor modifications to produce your desired output.
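One such modification, sketched here under the assumption that you want to pass an explicit start date and end date instead of "today plus two days", is to build the list of directory names with the core modules Time::Piece and Time::Seconds. The helper name date_range and the YYYY-MM-DD input format are assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Piece;
use Time::Seconds qw(ONE_DAY);

# Return every date from $start to $end (inclusive) as 'YYYY-MM-DD' strings.
# date_range is a hypothetical helper, not part of the script above.
sub date_range {
    my ($start, $end) = @_;

    # strptime dies if a date does not match the expected format.
    my $from = Time::Piece->strptime($start, '%Y-%m-%d');
    my $to   = Time::Piece->strptime($end,   '%Y-%m-%d');
    die "Start date $start is after end date $end\n" if $from > $to;

    my @dates;
    for (my $day = $from; $day <= $to; $day = $day + ONE_DAY) {
        push @dates, $day->ymd;    # ymd() formats the date as YYYY-MM-DD
    }
    return @dates;
}

# Usage: perl dates.pl 2018-01-03 2018-01-05
print "$_\n" for @ARGV == 2 ? date_range(@ARGV) : ();
```

The returned list can be handed to get_files() in place of @datesFormatted.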

Hope this helps, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!

Re^2: Grep logs by start date and end date in different directories
by Anonymous Monk on Jan 05, 2018 at 01:46 UTC
    Hi, that helped me loads. Thanks a lot. However, is it possible to use ->name('*.bz2')? All my log files are compressed in bz2 format. I have tested it, but it didn't seem to work with *.bz2; it only works with *.log. Any idea why? Once again, thank you so much.
      Sorry to bother you again, but this is rather urgent for me, so I am working on it on my own while hoping to get more insights from professionals that would let me do it in a better way. My current script also searches all the log files for IPs in a network range. This is the code that does what I've mentioned:
      use Net::Subnet;

      if (@ARGV) {
          while (<>) {
              my @ips = m/(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/g;
              next unless @ips;
              next unless grep { $matcher->($_) } @ips;
              print $fh $_;
          }
      }
      Do you know how I can integrate this into your code? Thanks again.

        Hello Anonymous Monk,

        Apologies for the late reply, but I just noticed your reply to my comment.

        Your question is very open. I am not sure what you mean by "My current script also searches for IP in a network range from all the log file", because a network range can vary greatly. Please give more specific information, e.g. 127.0.0.1 - 127.0.0.255: what is the range, and how will the IP be supplied? Will you supply a single IP, e.g. 127.0.0.1, and want to check which IPs match its network, subnet or range? Does this exact IP exist in your log files? Or are you looking for any number of the form 1-255.1-255.1-255.1-255?
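To make the question concrete, here is a minimal sketch that assumes the "network range" is a single IPv4 CIDR block. It uses no CPAN modules; make_cidr_matcher is a hypothetical stand-in for the $matcher that the snippet above uses but never defines, and search_for_range and the example block 127.0.0.2/31 are likewise invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The IPv4 pattern from the question, compiled once.
my $ipv4 = qr/(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/;

# Convert a dotted quad to a 32-bit integer; returns undef if it is not valid.
sub ip_to_int {
    my ($ip) = @_;
    my @octets = $ip =~ /\A(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/
        or return;
    return if grep { $_ > 255 } @octets;
    return unpack 'N', pack 'C4', @octets;
}

# Build a closure that tests whether an address lies inside a CIDR block.
sub make_cidr_matcher {
    my ($cidr) = @_;
    my ($net, $bits) = split m{/}, $cidr, 2;
    $bits //= 32;    # a bare address matches only itself
    my $base = ip_to_int($net);
    die "Invalid CIDR block '$cidr'\n"
        unless defined $base && $bits =~ /\A\d{1,2}\z/ && $bits <= 32;

    my $mask = $bits == 0 ? 0 : (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF;
    $base &= $mask;
    return sub {
        my $n = ip_to_int($_[0]);
        return defined $n && ($n & $mask) == $base;
    };
}

# Same shape as searchForIP, but prints lines holding any IP in the range.
sub search_for_range {
    my ($files, $matcher) = @_;
    return unless @$files;    # with an empty @ARGV, <> would read STDIN

    local @ARGV = @$files;
    while (my $line = <>) {
        my @ips = $line =~ /($ipv4)/g;
        print "$ARGV:$.:$line" if grep { $matcher->($_) } @ips;
    } continue {
        close ARGV if eof;    # reset $. at the end of each file
    }
    return;
}

# Usage: perl range.pl 127.0.0.2/31 2018-01-03/access.log
if (@ARGV >= 2) {
    my ($cidr, @logs) = @ARGV;
    search_for_range(\@logs, make_cidr_matcher($cidr));
}
```

If the range is really a list of subnets, or a start and end address, only make_cidr_matcher needs to change; the search loop only ever sees a code reference.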

        We need a sample of the data in the files to see the format. For example, you have only just mentioned that your files are bz2-compressed, for which fellow Monk haukex proposed a module and asked a few similar questions.

        So please give us more specific information so that we can help you.
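On the bz2 point, a likely explanation (an assumption, since no sample files were posted) is that ->name('*.bz2') does find the files, but <> then reads the raw compressed bytes, so the IP never matches. A minimal sketch that decompresses on the fly with the core module IO::Uncompress::Bunzip2 follows; the helper name search_bz2_for_ip is hypothetical and is not necessarily the module haukex suggested.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Uncompress::Bunzip2 qw($Bunzip2Error);

# Return "file:line:text" for every decompressed line that contains $ip.
# search_bz2_for_ip is a hypothetical helper used only for this sketch.
sub search_bz2_for_ip {
    my ($files, $ip) = @_;

    my @hits;
    for my $file (@$files) {
        # MultiStream lets files made of several bzip2 streams be read whole.
        my $z = IO::Uncompress::Bunzip2->new($file, MultiStream => 1)
            or die "Cannot open $file: $Bunzip2Error\n";

        my $line_no = 0;
        while (defined(my $line = $z->getline)) {
            $line_no++;
            # \Q...\E stops the dots in the IP from matching any character
            push @hits, "$file:$line_no:$line" if $line =~ /\Q$ip\E/;
        }
        $z->close;
    }
    return @hits;
}

# Usage: perl bz2.pl 127.0.0.1 2018-01-03/access.log.bz2
if (@ARGV >= 2) {
    my ($ip, @logs) = @ARGV;
    print for search_bz2_for_ip(\@logs, $ip);
}
```

The file list can still come from File::Find::Rule with ->name('*.bz2'); only the reading step changes.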

        Hope this helps, BR.

