PerlMonks  

maintain control over very many files

by costas (Scribe)
on Feb 25, 2002 at 10:49 UTC ( [id://147294]=perlquestion )

costas has asked for the wisdom of the Perl Monks concerning the following question:

I have a large directory with very many subfolders and thousands of files. The content of the files is constantly updated, and new files are constantly added.

Using a cron job, I want to scan through every file in the directory and pick out any files that have recently been created and any files that have recently been modified.

This is an extremely significant project, and the end result will be responsible for the maintenance of an important website (that is why I put this email to you monks :-) ). For anybody who is wondering, we have been using programs such as rendezvous for the purpose of constantly scanning files, but it is not satisfactory.

Could anybody please help me by telling me:

- What modules or scripts exist which could help me achieve TOTAL control and maintenance over my files. I need it to be 100% stable and reliable (I suppose that goes without saying).

Thank you in advance

Costas

Replies are listed 'Best First'.
Re: maintain control over very many files
by clemburg (Curate) on Feb 25, 2002 at 12:06 UTC
Use File::Find and stat - Re: maintain control over very many files
by metadoktor (Hermit) on Feb 25, 2002 at 12:18 UTC
    You can use File::Find to traverse the directory that you want to look at, and stat each file for its modification time as you go.

    As for keeping track of new files you can always save the previous directory contents and compare them to whatever is in there now. If your list of files is truly huge then you may have to employ special tricks to optimize your compares.
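A minimal sketch of the approach described above, not a definitive implementation: walk the tree with File::Find, record each file's modification time from stat, and compare the result with a snapshot saved by the previous run. The sub names `scan_tree` and `diff_snapshots`, the use of Storable for the snapshot, and the two command-line arguments are assumptions made for illustration; none of them come from the thread.

```perl
use strict;
use warnings;
use File::Find;
use Storable qw(retrieve nstore);

# Walk $root and return a hash ref of { path => mtime } for every plain file.
sub scan_tree {
    my ($root) = @_;
    my %mtime;
    find(
        {
            no_chdir => 1,    # keep $_ as the full path
            wanted   => sub {
                return unless -f $_;
                $mtime{$File::Find::name} = (stat _)[9];    # reuse the -f stat
            },
        },
        $root
    );
    return \%mtime;
}

# Compare the previous snapshot with the current one.
# Returns two array refs: newly seen paths and paths whose mtime changed.
sub diff_snapshots {
    my ($old, $new) = @_;
    my (@added, @modified);
    for my $path (sort keys %$new) {
        if (!exists $old->{$path}) {
            push @added, $path;
        }
        elsif ($new->{$path} != $old->{$path}) {
            push @modified, $path;
        }
    }
    return (\@added, \@modified);
}

# Hypothetical usage: perl scan.pl <root-directory> <snapshot-file>
my ($root, $state_file) = @ARGV;
if (defined $root && defined $state_file) {
    my $old = -e $state_file ? retrieve($state_file) : {};
    my $new = scan_tree($root);
    my ($added, $modified) = diff_snapshots($old, $new);
    print "new: $_\n"      for @$added;
    print "modified: $_\n" for @$modified;
    nstore($new, $state_file);    # becomes the baseline for the next run
}
```

The snapshot file is best kept outside the scanned tree, otherwise it shows up as modified on every run. Comparing with `!=` rather than `>` also flags files that were restored with an older timestamp.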

    metadoktor

    "The doktor is in."

      No, you don't want to go that way. There are tools out there (see above) that can do it all for you. Why reinvent the wheel for the nth time?

      Christian Lemburg
      Brainbench MVP for Perl
      http://www.brainbench.com

      _IF_ you want to do such a thing, you will have to remember, for each file, what its last modification time was. Where are you going to store this?

      The trick is (that is, _IF_ you _REALLY_ want this) to touch a .time file (or some other hidden file) when you're done with that dir, and then process the next one. This way everything newer than .time is new or changed... (if . is <= .time you don't even have to look in that dir...)
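The marker-file trick can be sketched roughly as follows. This is an illustration under assumptions, not the poster's code: the sub names `changed_since_marker` and `touch_marker` are made up here, only the `.time` name comes from the post, and the sketch handles a single directory without recursing.

```perl
use strict;
use warnings;
use File::Spec;

my $MARKER = '.time';    # marker file name, as suggested in the post

# Create the marker if needed and set its mtime to the current time.
sub touch_marker {
    my ($marker) = @_;
    open my $fh, '>>', $marker or die "Cannot touch $marker: $!";
    close $fh;
    my $now = time;
    utime $now, $now, $marker or die "Cannot set time on $marker: $!";
    return;
}

# Return the names of plain files directly inside $dir that are newer than
# the marker, then touch the marker. Without a marker, every file counts.
sub changed_since_marker {
    my ($dir) = @_;
    my $marker = File::Spec->catfile($dir, $MARKER);
    my $since  = -e $marker ? (stat $marker)[9] : 0;

    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = sort readdir $dh;
    closedir $dh;

    my @changed;
    for my $name (@names) {
        next if $name eq $MARKER;
        my $path = File::Spec->catfile($dir, $name);
        next unless -f $path;
        push @changed, $name if (stat _)[9] > $since;
    }

    touch_marker($marker);
    return @changed;
}
```

Two caveats apply to this sketch. Modification times usually have one-second granularity, so a file changed in the same second the marker is touched can be missed. The sketch also skips the directory-level shortcut from the post, because a directory's own timestamp generally changes only when entries are added, removed, or renamed, not when an existing file is edited in place.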

      Sinister greetings.
      "With tying hashes you can do everything God and Larry have forbidden" -- Johan Vromans - YAPC::Europe 2001
      perldoc -q $_
Re: maintain control over very many files
by rinceWind (Monsignor) on Feb 25, 2002 at 11:03 UTC

    costas, you didn't say what operating system platform you are running on. This is an important consideration, as there are differences in quota management, access control lists, etc.

    I presume it is some variety of Unix, as you mention cron.

    In terms of wanting _TOTAL_ control of your files, please can you be more specific as to what you want.

Re: maintain control over very many files
by Sinister (Friar) on Feb 25, 2002 at 11:01 UTC
    I guess I can't tell you much if you don't supply a little context.

    For instance, do these new files need to be uploaded somewhere else? Then use rsync.
    Do you need to just report these files? Use a perl-script.


    Sinister greetings.
    "With tying hashes you can do everything God and Larry have forbidden" -- Johan Vromans - YAPC::Europe 2001
    perldoc -q $_
Re: maintain control over very many files
by costas (Scribe) on Feb 25, 2002 at 12:03 UTC
    The directory I am scanning is a local copy of a website. Any files with a recent modification date will then be FTP'd to the live site.

    I am running the site on IIS. It's a total Microsoft environment, but I have ActivePerl installed on the machine.
Re: maintain control over very many files
by costas (Scribe) on Feb 25, 2002 at 12:06 UTC
    I mentioned cron since I am used to working in Unix, but I will be using the MS equivalent.
Re: maintain control over very many files
by costas (Scribe) on Feb 25, 2002 at 12:11 UTC
    I essentially want to check file modification dates only.

    I am also running on IIS.
Re: maintain control over very many files
by costas (Scribe) on Feb 26, 2002 at 14:59 UTC
    I have now expanded on my needs and am looking for the following:

    A Perl script/program which can monitor many folders on a local machine (each folder representing a different site). The Perl script will be executed on a timely basis. If a new or modified file is found, the file will then be FTP'd to its relevant hosting machine.

    All platforms are Microsoft with IIS.

    Can anybody recommend or tell me the best solution? Or is anybody already using a Perl-based solution such as this? Note that multiple site flows must be maintained.

    thanks costas
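One possible shape for the multi-site requirement above, offered as a sketch under stated assumptions rather than a recommended solution: a table maps each local folder to its FTP target, and changed files are pushed with Net::FTP. The `%SITES` table, its host names, credentials, and paths, and the sub names `remote_path_for` and `upload_files` are all hypothetical. How the list of changed files is produced is left to whichever change-detection method is used.

```perl
use strict;
use warnings;
use File::Spec;
use Net::FTP;

# Hypothetical site table: local folder => FTP target (placeholder values).
my %SITES = (
    'C:/sites/example-one' => {
        host        => 'ftp.example.com',
        user        => 'user1',
        pass        => 'secret1',
        remote_root => '/wwwroot',
    },
);

# Map a local file under $local_root to its path below $remote_root.
sub remote_path_for {
    my ($local_root, $remote_root, $file) = @_;
    my $rel = File::Spec->abs2rel($file, $local_root);
    $rel =~ s{\\}{/}g;    # Windows separators to FTP-style slashes
    return "$remote_root/$rel";
}

# Upload the given local files to one site, creating remote directories
# as needed. Dies on connection or login failure, warns on a failed put.
sub upload_files {
    my ($site, $local_root, @files) = @_;
    my $ftp = Net::FTP->new($site->{host})
        or die "Cannot connect to $site->{host}: $@";
    $ftp->login($site->{user}, $site->{pass})
        or die "Login failed: " . $ftp->message;
    $ftp->binary;

    for my $file (@files) {
        my $remote = remote_path_for($local_root, $site->{remote_root}, $file);
        (my $remote_dir = $remote) =~ s{/[^/]+\z}{};
        $ftp->mkdir($remote_dir, 1) if length $remote_dir;    # recursive mkdir
        $ftp->put($file, $remote)
            or warn "Upload of $file failed: " . $ftp->message;
    }
    $ftp->quit;
    return;
}

# Hypothetical driver, one pass per site:
#   for my $root (sort keys %SITES) {
#       my @changed = ...;    # full paths of new or modified files under $root
#       upload_files($SITES{$root}, $root, @changed) if @changed;
#   }
```

Keeping one table entry per site means each site flow stays independent: a failed login for one host ends that pass only if the driver wraps the call in `eval`, which is a design choice left open here.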
