PerlMonks  

Monitor directories or files have any change

by benlaw (Scribe)
on Oct 27, 2005 at 21:55 UTC ( [id://503526]=perlquestion )


benlaw has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I want to create a cron job that monitors directories, sub-directories and files for any modification. Is there a Perl module that can do it, or other software (not including tripwire)? Thanks a lot.

Thanks for all the quick replies; they are really helpful to me. My situation is that I need to monitor a Linux box's www directory, including sub-directories, files and db files (MySQL), and raise an alert to the system when a file or directory has been added, modified or deleted. At the beginning I was writing a Perl monitor based on "ls -laR", but a guy asked me what happens if a file's content changes while its name, date/time and size stay the same. Oh yes. I tried changing my script to checksum with MD5, but I seem to have a problem when using MD5. (The case: I rename bigfile.zip to bigfile.txt, then echo a >> bigfile.txt, and the checksum of bigfile.txt is the same hash code although the contents are different!) So I am looking for another file signature method. Thanks

Replies are listed 'Best First'.
Re: Monitor directories or files have any change
by sgifford (Prior) on Oct 27, 2005 at 22:18 UTC
    I'm not aware of any. The basic strategy you probably want is to save a list of whatever you want to monitor for changes (file names? Sizes? Modification times?) in a file or database, then periodically check whether the current state is different from your last saved state.

    A very simple way to do this is to save the output of ls -l in a file, then re-run the command and see if the output is different.
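A minimal sketch of that save-and-compare strategy, assuming size and modification time are the properties worth tracking. The names take_snapshot and diff_snapshots are made up for this illustration, and saving the snapshot between runs is left out.

```perl
use strict;
use warnings;
use File::Find;

# Walk $dir and return { path => "size:mtime" } for every entry found.
sub take_snapshot {
    my ($dir) = @_;
    my %snap;
    find({
        no_chdir => 1,
        wanted   => sub {
            my @st = lstat($File::Find::name) or return;   # entry vanished
            $snap{$File::Find::name} = "$st[7]:$st[9]";    # size:mtime
        },
    }, $dir);
    return \%snap;
}

# Compare two snapshots; return sorted lists of added, deleted and
# modified paths (as three array references).
sub diff_snapshots {
    my ($old, $new) = @_;
    my (@added, @deleted, @modified);
    for my $path (sort keys %$new) {
        if    (!exists $old->{$path})          { push @added, $path }
        elsif ($old->{$path} ne $new->{$path}) { push @modified, $path }
    }
    for my $path (sort keys %$old) {
        push @deleted, $path unless exists $new->{$path};
    }
    return (\@added, \@deleted, \@modified);
}
```

A cron job would load the previous snapshot, call take_snapshot on the tree, pass both to diff_snapshots, and then save the new snapshot for the next run.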

    To monitor directories from a continuously running program, SGI::FAM might help.

      Thanks, SGI::FAM is quite difficult for me~ especially magicrcs
Re: Monitor directories or files have any change
by EvanCarroll (Chaplain) on Oct 27, 2005 at 22:35 UTC
    You could
    1. Create a global hash %db, or use a database etc.
    2. Use File::Find to walk the tree and send each path to a callback.
    3. In the callback (pseudocode): if ( -f ) { $file = $File::Find::name; $db{$file} = MD5 digest of $file, computed with Digest::MD5; }
    4. Then run again, and either alert the user if one of the MD5s is different, or save the hash using Storable and compare it to a newly generated hash.

    Or something along those lines.
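The steps above can be sketched roughly as follows. This is only one way to fill in the details: build_db and check_tree are names invented for the example, the digest store is assumed to live outside the monitored tree, and unreadable files are silently skipped.

```perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;
use Storable qw(store retrieve);

# Return { path => hex MD5 } for every plain file under $dir.
sub build_db {
    my ($dir) = @_;
    my %db;
    find({
        no_chdir => 1,
        wanted   => sub {
            my $file = $File::Find::name;
            return unless -f $file;
            open(my $fh, '<', $file) or return;   # skip unreadable files
            binmode($fh);
            $db{$file} = Digest::MD5->new->addfile($fh)->hexdigest;
            close($fh);
        },
    }, $dir);
    return \%db;
}

# Compare the tree against the digests saved in $store_file (if any),
# then save the fresh digests. Returns the sorted list of paths whose
# digest is new, gone, or different.
sub check_tree {
    my ($dir, $store_file) = @_;
    my $new  = build_db($dir);
    my $old  = -e $store_file ? retrieve($store_file) : {};
    my %seen = (%$old, %$new);
    my @changed = grep {
        !exists $old->{$_} || !exists $new->{$_} || $old->{$_} ne $new->{$_}
    } sort keys %seen;
    store($new, $store_file);
    return @changed;
}
```

Note that the very first call reports every file, because there is no saved hash to compare against yet.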


    Evan Carroll
    www.EvanCarroll.com
      Thanks, Evan Carroll. I found something interesting about MD5. I renamed a file, bigfile.zip (over 200MB), to bigfile.txt, then typed echo a >> bigfile.txt, and found that the MD5 hash value does not change after appending "a" to it~ I find this weird, although it is not really a Perl topic. ^^"

        That's probably because there is a ctrl-Z (ascii 26) somewhere in the zip file. When you renamed it to .txt, it got treated as a non-binary file and only the contents up to the (first) ctrl-Z got processed. When you appended an 'a' to it and ran it again, the same thing happened. Only the first part of the data, up to the first ^Z, was processed. Hence the md5 didn't change.
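If the digest is computed from Perl, one way to sidestep that kind of text-mode surprise is to read the file through a handle that has been put into binary mode. A minimal sketch, assuming the text-mode explanation above is the cause (md5_of_file is a made-up helper name):

```perl
use strict;
use warnings;
use Digest::MD5;

# Digest the raw bytes of $file. binmode() switches off any text-mode
# translation on the handle, so the whole file is read whatever its
# extension is.
sub md5_of_file {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Can't open '$file': $!";
    binmode($fh);
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh);
    return $digest;
}
```

On Unix-like systems binmode makes no difference for plain byte reads, so it is harmless to leave it in for portability.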


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Monitor directories or files have any change
by graff (Chancellor) on Oct 27, 2005 at 22:41 UTC
    Many monks here will typically suggest File::Find, but if this is for a unix system (and you don't need to worry about portability to non-unix), you'll be much better off using the standard "find" utility -- not only is it easier to use and better documented than File::Find, it is also significantly faster and places a lot less load on a cpu.

    I believe all versions of "find" have a "-newer" option, which lists only files/directories that have been created/modified more recently than a given reference file. (The bsd version seems to have the best range of flexibility I've seen for this option, but the GNU version is likely to be just as good. If you have the Solaris version, um... you should probably get the GNU version.) It should be easy to work out how to provide the right "-newer" reference each time the cron job runs, for example a timestamp file that the job updates on every run.
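One possible way to drive "find -newer" from a cron job is a small Perl wrapper around a timestamp file. This is a sketch under assumptions: changed_since_last_run is an invented name, a "find" with "-newer" is assumed to be on the PATH, and the stamp file is assumed to live outside the monitored tree.

```perl
use strict;
use warnings;

# List everything under $dir that is newer than $stamp_file, then move
# the stamp forward so the next run starts where this one began.
# On the first run (no stamp file yet) nothing is reported.
sub changed_since_last_run {
    my ($dir, $stamp_file) = @_;
    my $now = time;     # taken before the scan, so changes made during it are seen next time
    my @changed;
    if (-e $stamp_file) {
        open(my $find, '-|', 'find', $dir, '-newer', $stamp_file)
            or die "Can't run find: $!";
        chomp(@changed = <$find>);
        close($find);
    }
    else {
        # First run: create an empty stamp file.
        open(my $fh, '>', $stamp_file) or die "Can't create '$stamp_file': $!";
        close($fh);
    }
    utime($now, $now, $stamp_file) or die "Can't update '$stamp_file': $!";
    return @changed;
}
```

A directory whose entries were added or removed shows up in the list as well, since that changes the directory's own modification time. Deleted files themselves are not listed, so this approach only answers "what is new or modified".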

    (GNU 'find' is available for MS-Windows as well, if you are somehow using a Windows version of "cron".)

    (updated to fix spelling error)

Re: Monitor directories or files have any change
by sauoq (Abbot) on Oct 27, 2005 at 23:17 UTC
      Thx sauoq. Is File::Signature suitable for directories and sub-directories? I have read the doc, but it does not seem to talk about that. Thanks

        That's a good question. The answer seems to be yes, but there may be caveats. It open()s the pathname for reading and uses Digest::MD5 to generate a fingerprint. Thinking about that, I'm really not sure if that behavior is well-defined. It doesn't seem to cause a problem on Linux at any rate. The fingerprint is the same as for an empty file. That's okay because the module also keeps info from stat() and that's what you are worried about when detecting whether a directory has changed. It checks the inode, mode, uid, gid, size, and mtime. If that's sufficient for you, I'd give it a go.
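To illustrate why the stat() information alone is enough to notice a change to a directory, here is a small sketch using only Perl's built-in stat. It is not File::Signature's API; stat_signature is a name made up for the example, and it simply joins the six fields listed above.

```perl
use strict;
use warnings;

# Return "inode:mode:uid:gid:size:mtime" for $path,
# or nothing (undef in scalar context) if stat fails.
sub stat_signature {
    my ($path) = @_;
    my @st = stat($path) or return;
    # stat fields: 1 = inode, 2 = mode, 4 = uid, 5 = gid, 7 = size, 9 = mtime
    return join ':', @st[1, 2, 4, 5, 7, 9];
}
```

Creating, deleting or renaming an entry inside a directory updates that directory's mtime, so its signature changes. Editing the contents of a file that already exists in the directory does not, which is why each file still needs its own signature.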

        I've used File::Signature on Solaris to create a tripwire-esque tool for monitoring data integrity but it wasn't for general use; it had a limited scope and I don't think the issue of monitoring directories ever came up.

        I've also just noticed that make test fails one test if you run it as root. It's not of any consequence though because it is testing an error condition which doesn't occur for root. Specifically, it is creating an unreadable file and testing the error returned when it tries to create a signature for the unreadable file. But, root can read it despite its mode, so there is no error and the test fails.

        -sauoq
        "My two cents aren't worth a dime.";
        
Re: Monitor directories or files have any change
by jesuashok (Curate) on Oct 28, 2005 at 00:39 UTC
    Hi,

    Here you can use the system calls that are available in Perl, because Perl makes the system calls of the C library available through modules.

    All of this functionality is already available as a module in Perl.

    Using these modules you can directly access a file's inode entry in the file system and get the relevant information from it.

    A program written this way will be faster and useful for all the applications you develop.

    "Keep pouring your ideas"
Re: Monitor directories or files have any change
by tirwhan (Abbot) on Oct 28, 2005 at 07:36 UTC
    Non-perl answer, but if you're doing this for any kind of security-related reason you may want to take a look at Samhain, which is a full-featured file integrity checking system that can be installed either as a daemon or a cronjob.

    Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan
Re: Monitor directories or files have any change
by zentara (Cardinal) on Oct 28, 2005 at 08:57 UTC
    Doing md5sums on a large directory of large files can take a lot of your cpu time. When I messed with it before, I used sum (a checksum) instead. See "man sum" or "sum --help". It is not as accurate as md5, but it easily detects an added newline.

    I'm not really a human, but I play one on earth. flash japh
      I've noticed that 'sum' varies by platform and tends to use an older 16-bit algorithm. I've had fairly good luck with the similar cksum.
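For comparison, a very simple additive checksum can also be computed inside Perl with unpack's "%" checksum prefix. This sketch is neither the algorithm used by "sum" nor the one used by "cksum", and byte_sum is a name made up here; it only illustrates the trade-off between a simple checksum and a digest.

```perl
use strict;
use warnings;

# Add up every byte of $file (modulo 2**32), reading in 64 KB chunks.
sub byte_sum {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Can't open '$file': $!";
    binmode($fh);
    my $sum = 0;
    while (read($fh, my $buf, 65536)) {
        # "%32C*" yields the 32-bit sum of all bytes in the chunk.
        $sum = ($sum + unpack('%32C*', $buf)) % 4294967296;
    }
    close($fh);
    return $sum;
}
```

An appended newline changes the result, but two files containing the same bytes in a different order get the same sum, which is the kind of inaccuracy that makes such checksums weaker than MD5.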
Re: Monitor directories or files have any change
by radiantmatrix (Parson) on Oct 28, 2005 at 11:05 UTC

    What about using a version-control system?

    Check your web directory into, say, CVS. CVS can then tell you if there are any files in your directory that are not in CVS (newly created) or which have contents different from those in CVS (modified). Doing this has the added advantage that you can quickly and easily restore the canonical version, and you can easily find out exactly what changed.

    If CVS is the version system you choose, it can be handled very perlishly: there are a number of Cvs modules to use.

    <-radiant.matrix->
    A collection of thoughts and links from the minds of geeks
    The Code that can be seen is not the true Code
    "In any sufficiently large group of people, most are idiots" - Kaa's Law
Re: Monitor directories or files have any change
by graff (Chancellor) on Oct 29, 2005 at 13:59 UTC
    Responding to this part of your update:
    a guy ask me if a file content with different content but same name, same date/time, same size, what will happen
    To clarify: the question here is: "what if the content of a particular file happens to change, while its name, its size and its modification date all remain unchanged?"

    If that is a correct restatement of the question, I think the answer would be that this could only arise from two possible things happening:

    1. There has been some damage or corruption to the hard disk, altering the data content of the blocks allocated to this file, or
    2. Someone alters the file contents in some normal way (which changes the modification time in its directory entry), and then deliberately uses the unix "touch" command to reset the modification date to its previous value.
    Apart from those two things, I don't know of any way for file contents to change without altering the modification time/date field in the file's directory entry.

    Usually, if the first type of problem happens, it has a much wider impact (e.g. the whole disk becomes non-functional). As for the second type of problem, if you really do have to watch out for that sort of trickery, then you certainly do want to maintain checksums on all data files (and take extra steps to protect the checksum list from unauthorized access).

    If there isn't a plausible risk of the latter sort of problem, then just checking directory trees with "find", looking for recently modified files, should suffice.
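The second scenario is easy to reproduce, which may help in deciding whether checksums are worth the cost. The sketch below uses made-up helper names and Perl's utime in place of the "touch" command: it rewrites a file, puts the old timestamps back, and compares a cheap size-and-mtime fingerprint with a content digest.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Read a whole file as raw bytes.
sub slurp {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Can't open '$file': $!";
    binmode($fh);
    local $/;
    my $data = <$fh>;
    close($fh);
    return $data;
}

# Overwrite $file with $content, then restore the previous access and
# modification times, much as "touch" could be used to do by hand.
sub rewrite_keeping_mtime {
    my ($file, $content) = @_;
    my ($atime, $mtime) = (stat $file)[8, 9];
    open(my $fh, '>', $file) or die "Can't write '$file': $!";
    binmode($fh);
    print {$fh} $content;
    close($fh) or die "Can't close '$file': $!";
    utime($atime, $mtime, $file) or die "Can't reset times on '$file': $!";
    return;
}

# "size:mtime" as a cheap fingerprint, MD5 of the contents as a thorough one.
sub cheap_fingerprint   { my @st = stat($_[0]); return "$st[7]:$st[9]" }
sub content_fingerprint { return md5_hex(slurp($_[0])) }
```

If a file holding "hello" is rewritten with "jello" through rewrite_keeping_mtime, cheap_fingerprint returns the same string before and after, while content_fingerprint differs. The inode change time (field 10 of stat) is updated by the rewrite and cannot be set with utime, so it is one more field a monitor could record.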

Re: Monitor directories or files have any change
by benlaw (Scribe) on Nov 08, 2005 at 01:27 UTC
    Hi all, thanks to all the experts. I tried to code the program; it is not bug-free. I have tested it for a few days and it works in my case. However, it seems to take a long time to "MD5" a big file.

    config file

        # ver_check.cfg
        path=/var/www
        master=master
        current=current
        log=version_check.log

    code file

        #!/usr/bin/perl
        # version_check.pl
        # Not a bug-free program
        use strict;
        use warnings;
        use Digest::MD5;

        # ---- Variables for generating a list ----
        my $flag1 = 1;      # 1 = expecting a "directory:" header line from ls -R
        my $flag2 = 0;      # 1 = inside a directory listing
        my %filelist = ();
        my %md5 = ();
        my $current_path = '';
        my $path = '';
        my $filename = '';

        # ---- Variables for comparing lists ----
        my $masterfile = '';
        my $currentfile = '';
        my %a = ();
        my %b = ();
        my @error = ();     # shared by COMPARE and HASHSIZE_DECI

        #==================================
        if (@ARGV) {
            foreach my $arg (@ARGV) {
                if ($arg =~ /^-file:(\w.*)/) {
                    $filename = $1;
                    GENLIST();
                } elsif ($arg =~ /^-comp$/) {
                    COMPARE();
                #} elsif ($arg =~ /^-log:(\w.*)/) {
                #    my $logfile = $1;
                #    LOG($logfile);
                } elsif ($arg =~ /^-v$/) {
                    VERBOSE();
                } else {
                    print "option: $arg\n";
                    OPTION();
                }
            }
        } else {
            OPTION();
        }

        # Generate a list
        sub GENLIST {
            open(my $conf, '<', 'ver_check.cfg') or die "Can't open ver_check.cfg: $!";
            while (<$conf>) {
                if (/^path=(.*)$/) { $path = $1; }
            }
            close($conf);

            open(my $list, '-|', 'ls', '-lAR', $path) or die "Can't run ls: $!";
            while (<$list>) {
                if ($flag1 == 0 && /^$/) {
                    # a blank line ends a directory block
                    $flag1 = 1;
                    $flag2 = 0;
                } elsif ($flag1 == 1 && /(.*):$/) {
                    # a "dir:" header starts a new block
                    $flag1 = 0;
                    $flag2 = 1;
                    $current_path = $1;
                } elsif ($flag2 == 1 && /^(([dsrwx\-]{10})\s*(\d{1,})\s*([\w\-]*)\s*([\w\-]*)\s*(\d*)\s*([\w\s\:]{1,12}))\s*(\S.*)$/) {
                    # $1 = whole line before the filename, $2 = permissions,
                    # $4,$5 = user/group name, $8 = filename, e.g.
                    # drwxr-xr-x 2 root root 4096 Oct 23 20:03 cgi-bin
                    # (\S rather than \w for the filename, so dot-files match too)
                    my $desc = $1;
                    my $name = $8;
                    my $key  = "$current_path/$name";   # separator depends on the system
                    if ($desc !~ /^d/) {
                        my $x = MD5($key);
                        $desc = "$desc&$x";
                    }
                    $filelist{$key} = $desc;
                }
            }
            close($list);

            open(my $out, '>', $filename) or die "Can't write '$filename': $!";
            foreach (sort keys %filelist) {
                print $out "$_|$filelist{$_}\n";
            }
            close($out);
        } # end of GENLIST

        sub OPTION {
            print "Wrong/missing operation!\n\n";
            print "You should write in this way:\n";
            print "\t-file:[file]\tCreate a file list with the given file name\n";
            print "\t-comp\t\tCompare the master and current lists\n";
            print "\n";
            exit;
        } # end of OPTION

        sub MD5 {
            my $file = $_[0];
            open(my $fh, '<', $file) or die "Can't open '$file': $!";
            binmode($fh);
            my $ctx = Digest::MD5->new;
            $ctx->addfile($fh);     # let Digest::MD5 read the file in blocks
            close($fh);
            my $md5code = $ctx->b64digest;
            $md5{$file} = $md5code;
            return $md5code;
        } # end of MD5

        ######## Comparison part
        sub COMPARE {
            @error = ();
            open(my $conf, '<', 'ver_check.cfg') or die "Can't open ver_check.cfg: $!";
            while (<$conf>) {
                if (/^master=(.*)$/)  { $masterfile  = $1; }
                if (/^current=(.*)$/) { $currentfile = $1; }
            }
            close($conf);
            %a = READIN($masterfile);
            %b = READIN($currentfile);
            foreach (keys %a) {
                if (!exists $b{$_}) {
                    push(@error, "File/Dir deleted: $_\n");
                } elsif ($a{$_} eq $b{$_}) {
                    delete $a{$_};
                    delete $b{$_};
                } else {
                    $a{$_} =~ s/&/ /;
                    $b{$_} =~ s/&/ /;
                    push(@error, "File/Dir modified: $_\n$masterfile : $a{$_}\n$currentfile : $b{$_}\n\n");
                    delete $a{$_};
                    delete $b{$_};
                }
            }
            # whatever is left in %b exists only in the current list
            HASHSIZE_DECI(\%b, "current");
            LOG(\@error);
        } # end of COMPARE

        sub READIN {
            my $listfile = $_[0];
            my %temphash = ();
            open(my $fh, '<', $listfile) or die "Can't open '$listfile': $!";
            while (<$fh>) {
                if (/\|/) {
                    chomp;
                    my @temp = split(/\|/, $_);
                    $temphash{$temp[0]} = $temp[1];
                }
            }
            close($fh);
            return %temphash;
        } # end of READIN

        sub HASHSIZE_DECI {
            my $temp = shift;
            my $file = shift;
            foreach (keys %$temp) {
                if ($file eq "master") {
                    push(@error, "File/Dir deleted: $_\n");
                } elsif ($file eq "current") {
                    push(@error, "File/Dir added: $_\n");
                }
            }
        } # end of HASHSIZE_DECI

        sub LOG {
            my $error = shift;
            my $logfile = "";
            open(my $conf, '<', 'ver_check.cfg') or die "Can't open ver_check.cfg: $!";
            while (<$conf>) {
                if (/^log=(\w.*)/) { $logfile = $1; }
            }
            close($conf);
            if ($logfile eq "") {
                print $_ foreach @$error;
            } else {
                open(my $log, '>', $logfile) or die "Can't write '$logfile': $!";
                print $log $_ foreach @$error;
                close($log);
            }
        } # end of LOG

        sub VERBOSE { }     # not implemented yet

    I generate the original list first:
        perl version_check.pl -file:master
    then create a cron job that runs:
        perl version_check.pl -file:current -comp
    All results are saved in version_check.log.
