PerlMonks  

How long have you been sitting on my server?

by Anonymous Monk
on Jan 27, 2003 at 17:38 UTC ( [id://230284]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

We have a policy for our ftp server that prohibits people from leaving files on it for more than 30 days. So I came up with a Perl script that checks the age of each file and deletes the file if its age is over 30 days.

The problem is that I am checking $mtime. If someone has a file that is over 30 days old and uploads it to the server, the file is deleted immediately. What I need to know is how long that file has been on the server, not its $mtime. Would $atime help? I don't think so. According to perldoc, I could also check the inode change time with the -C switch. Would this meet my needs? I am not sure, and I hope someone has some ideas. Thanks!


Replies are listed 'Best First'.
Re: How long have you been sitting on my server?
by Mr. Muskrat (Canon) on Jan 27, 2003 at 18:03 UTC
    Some things you should know about stat():
    • atime - last access time in seconds since the epoch
    • mtime - last modify time in seconds since the epoch
    • ctime - inode change time (NOT creation time!) in seconds since the epoch
    NOTE: *nix does not track creation time so you have to do it on your own.
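As a small illustration of the three timestamps listed above, the sketch below reads them with stat() on a throwaway temporary file. The helper name file_times is made up for this example; 8, 9 and 10 are the positions of atime, mtime and ctime in the list that stat() returns.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the three stat() timestamps of a file as a hash reference,
# or nothing if the file cannot be stat'ed.
# (file_times is a hypothetical helper name, not a built-in.)
sub file_times {
    my ($path) = @_;
    my @st = stat($path) or return;
    return { atime => $st[8], mtime => $st[9], ctime => $st[10] };
}

# Demonstration on a temporary file that is removed at exit.
my ($fh, $tmp) = tempfile(UNLINK => 1);
close $fh;

# Backdate atime and mtime by 60 days. The call to utime() changes the
# inode, so ctime is not backdated along with them.
my $old = time() - 60 * 86400;
utime $old, $old, $tmp or die "utime failed: $!";

my $t = file_times($tmp);
printf "%-5s %s\n", $_, scalar localtime($t->{$_}) for qw(atime mtime ctime);
```

On a typical Unix file system the first two lines show a date 60 days back while ctime shows roughly the current time, which is why ctime is an "inode change" time rather than a creation time.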
Re: How long have you been sitting on my server?
by Jenda (Abbot) on Jan 27, 2003 at 19:25 UTC

    What about keeping a tied hash (DB_File, SDBM, GDBM or whatever) of the files you've "seen" and

    • if you encounter a file you did not see yet (it's not in the hash):
      • add it to the hash
      • change its modification time to the current time
    • if you encounter a file that is in the hash and is older than 30 days:
      • delete it from the disk
      • delete it from the hash

    Jenda
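One way the tied-hash bookkeeping described above might look, as a sketch only. It assumes SDBM_File (shipped with Perl) and, as a deliberate variation, stores the first-seen time as the hash value instead of resetting the file's mtime, so the files' own timestamps stay untouched. The sub name sweep, its arguments and the 30-day constant are illustrative, and the sketch handles a single directory without recursion.

```perl
use strict;
use warnings;
use Fcntl;        # exports O_RDWR and O_CREAT
use SDBM_File;

my $MAX_AGE = 30 * 86400;    # 30 days in seconds

# One pass over $dir. $db is the base name of the SDBM files that hold
# the "seen" hash; $now defaults to the current time.
# Returns the names of the files deleted in this pass.
sub sweep {
    my ($dir, $db, $now) = @_;
    $now = time() unless defined $now;

    tie my %seen, 'SDBM_File', $db, O_RDWR | O_CREAT, 0666
        or die "Cannot tie $db: $!";
    opendir my $dh, $dir or die "Cannot open $dir: $!";

    my @deleted;
    for my $name (grep { -f "$dir/$_" } readdir $dh) {
        if (!exists $seen{$name}) {
            $seen{$name} = $now;                 # first sighting: remember when
        }
        elsif ($now - $seen{$name} > $MAX_AGE) { # on the server too long
            if (unlink "$dir/$name") {
                delete $seen{$name};
                push @deleted, $name;
            }
            else {
                warn "Cannot delete $dir/$name: $!\n";
            }
        }
    }
    closedir $dh;

    # Forget entries for files that users removed themselves.
    for my $name (keys %seen) {
        delete $seen{$name} unless -e "$dir/$name";
    }

    untie %seen;
    return @deleted;
}

# Hypothetical usage from a daily cron job (both paths are placeholders):
# sweep('/path/to/ftp/incoming', '/path/to/state/seen');
```

Because the age is measured from the first sighting, the result is only as precise as the interval between runs. SDBM also limits the size of each key/value pair, so DB_File or GDBM may be a better fit where file names can be very long.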

      Thanks! I will try that.
Re: How long have you been sitting on my server?
by tomhukins (Curate) on Jan 27, 2003 at 18:50 UTC

    Mr. Muskrat has already answered your question directly, but I hope I can provide some useful ideas:

    If your users are technically knowledgeable, you might find they replace the older files with newer files to avoid your 30 day restriction. In this case, you might want to compare the contents of all files within each user's home directory with the contents of all files that were previously there.

    You could use Digest::SHA1 or Digest::MD5 as a checksum for the contents of each file, perhaps stored in a DBM database or similar. File::Find::Rule::Digest might help.
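A minimal sketch of the checksum idea, assuming Digest::MD5 (bundled with Perl) and an ordinary in-memory hash standing in for the DBM database. The sub names file_digest and seen_before are invented for this example.

```perl
use strict;
use warnings;
use Digest::MD5;

# Return the hex MD5 digest of a file's contents.
sub file_digest {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # digest the raw bytes, not a translated text stream
    my $hex = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $hex;
}

# True if a file with identical contents was recorded earlier in %$known;
# otherwise record its digest and return false.
# $known could just as well be a reference to a tied DBM hash.
sub seen_before {
    my ($known, $path) = @_;
    my $digest = file_digest($path);
    return 1 if exists $known->{$digest};
    $known->{$digest} = $path;
    return 0;
}
```

Keying the hash on the digest rather than the file name means a renamed or re-uploaded copy of the same content is still recognised.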

      Mr. Muskrat has already answered your question directly

      mtime and atime do not tell me how long the file has been on the server. ctime or the inode change time represents the time when the file's meta-information last changed. So ctime also doesn't tell me how long the file has been on the server.

      Just thinking off the top of my head here: Could I come up with a script that logs any uploaded file in a database or text file? The script would have to record the file the instant it gets uploaded. How could I do this if we are using ftp to do the uploading? (This idea isn't perfect, since the user could always rename the file, but what else could I do?...)

        A simple yet sloppy fix would be to have your script create a .txt file with the same name as the uploaded file. Since it is created at the time of upload, your script can check the $mtime of the .txt file and, if it is 30 days old, delete filename.* I like the hash answer much better, though.
Re: How long have you been sitting on my server?
by bart (Canon) on Jan 27, 2003 at 19:51 UTC
    You can have a cron job, or something like it, that keeps an inventory of all the uploaded files, and the date each got added, in some form of database (say, a flat file). If you run it once a day, you can't be wrong by more than a day. An extra signature check, using Digest::MD5 for example, can store a fingerprint of the file, so you can see whether the file contents have changed since it got added. You might use a change as a signal to reset the age counter for this file.
Re: (nrd) How long have you been sitting on my server?
by newrisedesigns (Curate) on Jan 27, 2003 at 18:16 UTC

    Why don't you walk over the directories in your webserver and check each file using the last modified file test operator?

    if(-M $_ > 30*86400){ unlink $_; }

    Hope this helps. If not, post a reply. Why don't you sign up for Perl Monks, while you're at it.

    John J Reiser
    newrisedesigns.com

      if(-M $_ > 30*86400){ unlink $_; }
      That's some ancient files, man! -M returns the age of the file in days.
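To make the units concrete: since -M already yields days, the 30-day test compares against 30, not 30*86400. The sketch below shows this on a temporary file whose mtime is pushed 45 days into the past; older_than_days is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# True if the file's mtime is more than $days days before script start.
# -M returns undef for a file that cannot be stat'ed.
sub older_than_days {
    my ($path, $days) = @_;
    my $age = -M $path;
    return defined $age && $age > $days;
}

# Temporary file, removed at exit, with its mtime set 45 days back.
my ($fh, $tmp) = tempfile(UNLINK => 1);
close $fh;
my $then = time() - 45 * 86400;
utime $then, $then, $tmp or die "utime failed: $!";

printf "age according to -M: %.1f days\n", -M $tmp;
print "over 30 days: this file would be deleted\n" if older_than_days($tmp, 30);
```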

      Why don't you walk over the directories in your webserver and check each file using the last modified file test operator?

      The last modified test operator (-M) is exactly what I am using. I am sorry that I confused this with the age of the file. The problem is that a file could have been modified months ago, and when the file gets uploaded to the server, its last modified time doesn't change. I need to know how long the file has been on the server, not the last time it was modified or accessed.

        What FTP server/operating system do you use?

        I have a "last updated" script on my server. It checks a few directories for HTML files and sorts them according to the return value of -M. If I upload a file using FTP to my server, the value returned from -M is set to the time of the upload. I just tested this on my server using an HTML file that was two months old. It shows up first in the list, being updated "0 seconds ago" (upon reload of the CGI script).

        I'm pretty sure that an uploaded file gets its last modified value set to whatever the time of upload was. If the file was copied to the FTP directory using your operating system and not via FTP, I can see how -M wouldn't work.

        Hope this helps.

        John J Reiser
        newrisedesigns.com

Re: How long have you been sitting on my server?
by OM_Zen (Scribe) on Jan 27, 2003 at 19:05 UTC
    Hi,

    Is there a CVS repository on that file system? It would record the cvs add date and time for each version, and the earliest of those should give you the date and time the file came into existence.
      We are using proftpd on Solaris. We have the ftp directories on a separate partition (which doesn't include htdocs) and my boss wants to keep it that way. Would it be possible to use CVS, but have the files on a partition that Apache doesn't know about?
        Hi,

        The partition on which Apache resides can be different from the CVS repository's partition. You can maintain the repository anywhere, build a link to any partition, cvs add files to it, and cvs update when files are updated. That way the files that are added carry their creation time: even though other values such as $atime and $ctime might change when a file is modified, CVS can retain the creation time.
Re: How long have you been sitting on my server?
by SysApe9000 (Acolyte) on Jan 27, 2003 at 22:16 UTC

    Here's the rub: your policy is... unusual. You seem to be saying that your users can upload a file, but it cannot stay for more than 30 days. Is there anything preventing them from uploading it again as soon as you delete it? Can they upload the same file again before you delete it? (Simply overwriting the file with the same thing...)

    If the answer to either of the previous two questions is yes you should write a script that scans the directories in question periodically and records the name of any new files and when they first appeared. Of course, you also have to remove files from your list when they are deleted. That of course will allow the user to upload the same file again as soon as you delete it from your database of files. If you don't take it off your list of files, they'll never be able to upload a file with the same name twice.

    If your objective is to simply delete files which have not changed in 30 days, that's much easier and can be done with the ctime and mtime values. Here's the deal with ctime: it changes any time the file meta-data changes. This means that when the file was uploaded its mtime was set to the time when the last write completed. Then the system changed the mtime! This caused the ctime to be set to the time when the mtime was altered. As long as you don't change the meta-data yourself you can do something like this:

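    my $fl      = $ARGV[0];
    my @st      = stat($fl);
    my $timeout = 30 * 24 * 60 * 60;    # 30 days in seconds
    # $st[9] is mtime, $st[10] is ctime; skip files that could not be stat'ed
    if (@st and $st[9] < (time() - $timeout) and $st[10] < (time() - $timeout)) {
        unlink $fl;
    }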

    The above example uses the stat() built-in subroutine. I tend to prefer it to the -M and -C file test operators but I'm sure you can use whichever you want. Note that stat() returns the mtime at index 9 and the ctime at index 10 of its list (the tenth and eleventh values) and that the values are in seconds since 00:00 January 1, 1970 GMT. -M and -C return the time in a different format: an age in days.

    The above will cause the file in $fl to be deleted if and only if no one has modified it in the last 30 days. This makes more sense to me, but it may not be what you're trying for...

      Unusual? CPAN/PAUSE doesn't allow re-uploading a file, so it must remember the file names of every file that's ever been uploaded, forever, even after it's been deleted.
        I suppose that makes sense for an archive site... but I assumed it wasn't an archive site, because he's deleting files after only 30 days. There wouldn't be much on CPAN/PAUSE if that were the case, would there?
Re: How long have you been sitting on my server?
by sauoq (Abbot) on Jan 27, 2003 at 20:01 UTC
    According to perldoc, I could also check for inode change time with the -C switch. Would this meet my needs?

    Have you experimented to see if using the ctime will help you? We don't know anything about your server after all. I'm guessing that the ctime would work because it doesn't make much sense to preserve inode change time on files uploaded via ftp and the inode of the new file is certainly changed when it is created on the server.

    -sauoq
    "My two cents aren't worth a dime.";
    
      Have you experimented to see if using the ctime will help you?

      It might. But I'm not sure what it's recording. I had someone ftp an old file to the server, I then typed:

      perl -e "print -C 'somefile'"

      And the results were: 0.0172916666666667
      I then tried it again, and now the output was: 0.0216666666666667

      What's happening here exactly?

        That must be the "age" of the file, in days, since the start of the script. The values you got would be in the neighbourhood of 1/2 hour. Ladies and gentlemen, it looks like we've got a winner!
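The arithmetic behind those numbers can be checked by hand. The sketch below assumes that -C is computed as script start time ($^T) minus the ctime from stat(), divided by 86400; it compares the two on a temporary file and converts the first value quoted above into minutes. The variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Temporary file, removed at exit.
my ($fh, $tmp) = tempfile(UNLINK => 1);
close $fh;

# ctime in seconds since the epoch, straight from stat().
my $ctime = (stat $tmp)[10];

# The same information as an age in days, once computed by hand
# relative to script start ($^T) and once via the -C operator.
my $by_hand = ($^T - $ctime) / 86400;
my $by_op   = -C $tmp;
printf "by hand: %.6f days, via -C: %.6f days\n", $by_hand, $by_op;

# A fraction of a day converts to minutes by multiplying by 24 * 60.
my $minutes = 0.0172916666666667 * 24 * 60;
printf "0.0172916666666667 days is about %.1f minutes\n", $minutes;
```

A file created after the script started has a ctime later than $^T, so both values can come out as zero or slightly negative for the temporary file used here.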
        What's happening here exactly?

        The same thing that happens with -M and -A. They provide the age in days since the script started. If you want seconds since the epoch, you'll need stat(). Looking at your output, it does look like ctime should work for you.

        -sauoq
        "My two cents aren't worth a dime.";
        
Re: How long have you been sitting on my server?
by OM_Zen (Scribe) on Jan 27, 2003 at 19:28 UTC
    Hi,

    I actually mistook the nature of the files that you had, and I am sorry about that last reply of mine. Since the time variables do not help you, here is another idea.

    You might keep an ASCII file listing the name and date/time of each day's new files, then parse that file against the current directory, which has all the files, to see whether any entry is thirty days old. It sounds like a conventional way to handle it, but since the time variables $atime and $mtime do not give you the answer, this logic would give you a solution.

    In my first note I somehow thought these were normal files like programs, and that CVS would have the information about the earliest time each file was created, hence that answer. Please ignore it.
Re: How long have you been sitting on my server?
by CountZero (Bishop) on Jan 27, 2003 at 23:01 UTC

    Just a thought: does your FTP-program keep a daily log with names of uploaded files?

    If so, can you not just delete all files mentioned in the log which was created 30 days ago?

    You might want to check if the same file (or at least a file with the same name) was not uploaded later, so you can reset the timer.

    Dumping the log-file into a database shouldn't be too difficult and a simple SQL-query will give you a nice list of files to be deleted.

    CountZero
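A rough sketch of the log-based approach, under a stated assumption: the server writes a wu-ftpd-style xferlog, in which each line starts with a five-token timestamp and carries the file name in the ninth field, the direction (i for incoming) in the twelfth and the completion status (c for complete) in the eighteenth. Check the actual log format before relying on these positions. The sub names and the sample line used below are invented for illustration.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

my $MAX_AGE = 30 * 86400;    # 30 days in seconds
my %MON;
@MON{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 0 .. 11;

# Parse one log line. Returns (path, epoch seconds) for a completed
# upload, or an empty list for anything else.
sub parse_upload {
    my ($line) = @_;
    my @f = split ' ', $line;
    return unless @f >= 18;
    my ($mon, $day, $hms, $year)    = @f[1 .. 4];
    my ($path, $direction, $status) = @f[8, 11, 17];
    return unless $direction eq 'i' and $status eq 'c';
    return unless exists $MON{$mon};
    my ($h, $m, $s) = split /:/, $hms;
    return ($path, timelocal($s, $m, $h, $day, $MON{$mon}, $year));
}

# Given a reference to an array of log lines, return the sorted paths
# whose most recent upload is more than 30 days before $now.
sub expired_uploads {
    my ($lines, $now) = @_;
    my %latest;
    for my $line (@$lines) {
        my ($path, $when) = parse_upload($line) or next;
        $latest{$path} = $when
            if !defined $latest{$path} or $when > $latest{$path};
    }
    my @old = grep { $now - $latest{$_} > $MAX_AGE } keys %latest;
    return sort @old;
}
```

Keeping only the latest upload time per path implements the "reset the timer" check: a file uploaded again under the same name is measured from its most recent upload.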

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: How long have you been sitting on my server?
by waswas-fng (Curate) on Jan 28, 2003 at 00:48 UTC
    Check your proftpd configs: on all of my proftpd servers (Solaris 2.6, 7, 8 and 9) mtime gets updated on file upload. I have no idea why your setup would be different. In fact, I don't even see how it is possible: FTP's STOR command does not transfer timestamp info about the file it is storing, so how would the server even know what the mtime was on the client?

    -Waswas

    Edited:
    Here is the link to the FTP RFC. Note that searches for mtime, atime and ctime all come up nil =)
Re: How long have you been sitting on my server?
by dopey (Sexton) on Jan 28, 2003 at 01:37 UTC
    Can't you write a script that changes the $mtime when a file is submitted?

Node Type: perlquestion [id://230284]
Approved by valdez
Front-paged by Aristotle