PerlMonks  

Re: Re: Efficient processing of large directory

by Elliott (Pilgrim)
on Oct 03, 2003 at 15:37 UTC


in reply to Re: Efficient processing of large directory
in thread Efficient processing of large directory

I've tried it now with while ... and it timed out :-(

Looks like I'd better try subdirectories too.

Re^3: Efficient processing of large directory
by Aristotle (Chancellor) on Oct 03, 2003 at 22:19 UTC
    You should readdir instead. Also, if this is running from a CGI (I guess that's what timing out is referring to), then make sure to give the client a few bytes of data every now and then so it doesn't give up waiting.
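A minimal sketch of that advice, assuming a plain CGI script: read the directory one entry at a time with readdir and emit a byte of output every so often so the client keeps waiting. The name scan_dir, its callbacks, and the interval of 500 entries are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: walk $dir one entry at a time. Calls $on_entry for
# each name and $heartbeat after every $every entries. Returns the number
# of entries seen (excluding . and ..).
sub scan_dir {
    my ($dir, $on_entry, $heartbeat, $every) = @_;
    $every ||= 1000;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my $count = 0;
    # defined() guards against a file literally named "0".
    while (defined(my $name = readdir $dh)) {
        next if $name eq '.' or $name eq '..';
        $on_entry->($name);
        $heartbeat->($count) if ++$count % $every == 0;
    }
    closedir $dh;
    return $count;
}

# Usage sketch: runs only when a directory is given on the command line.
if (@ARGV) {
    $| = 1;                                 # unbuffered, so dots go out at once
    print "Content-Type: text/plain\n\n";   # assumes CGI-style output
    my $n = scan_dir($ARGV[0], sub { }, sub { print "." }, 500);
    print "\n$n entries\n";
}
```

Because the loop never holds the whole listing in memory, the per-entry work can stay cheap. Whether a dot every 500 entries is often enough to avoid a timeout depends on the server and client, so treat that interval as a guess.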

    Makeshifts last the longest.

      Now that I know readdir exists (thanks Aristotle!) I was able to RTFM and put it into practice. The functions that do not need to open the files have improved stunningly. So much so that the client rang me at home to thank me and backed it up with an email full of exclamation marks.

      Now I have to further improve the processes that have to read all the files. But I guess that's another thread!

      readdir is less efficient than glob if a subset of the directory contents is sought.
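For reference, here is a rough sketch of what "seeking a subset" looks like both ways: a glob pattern versus readdir filtered with grep. The helper names subset_glob and subset_readdir and the extension argument are hypothetical. The sketch only shows that the two forms select the same names and makes no claim about which is faster.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Basename qw(basename);

# Hypothetical helper: names in $dir ending in ".$ext", via a glob pattern.
# Assumes $dir itself contains no glob metacharacters.
sub subset_glob {
    my ($dir, $ext) = @_;
    my @names = map { basename($_) } bsd_glob("$dir/*.$ext");
    return sort @names;
}

# Hypothetical helper: the same subset via readdir plus grep.
# Dotfiles are skipped to mirror what the "*" pattern matches.
sub subset_readdir {
    my ($dir, $ext) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = grep { !/^\./ && /\.\Q$ext\E\z/ } readdir $dh;
    closedir $dh;
    return sort @names;
}

# Usage sketch: perl subset.pl DIR EXT
if (@ARGV == 2) {
    print "$_\n" for subset_readdir(@ARGV);
}
```

Either helper could be dropped into a Benchmark cmpthese run against a real directory to see how the two compare for a given subset.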


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
      If I understand your problem, I can solve it! Of course, the same can be said for you.

        Apart from the fact that the OP isn't looking for a subset.
        #!/usr/bin/perl
        use strict;
        use warnings;
        use Benchmark qw /cmpthese/;

        chdir shift or die;

        cmpthese -5 => {
            glob    => sub { 1 while glob '*'; },
            readdir => sub { opendir my $dh, '.' or die; 1 while readdir $dh; },
        };

        __END__
        $ mkdir x ; cd x ; touch `seq 1 20000` ; cd ..
        $ perl glob_vs_readdir.pl x
        Benchmark: running glob, readdir for at least 5 CPU seconds...
              glob:  6 wallclock secs ( 4.02 usr +  1.07 sys =  5.09 CPU) @  5.30/s (n=27)
           readdir:  6 wallclock secs ( 3.93 usr +  1.37 sys =  5.30 CPU) @ 65.28/s (n=346)
                  Rate    glob readdir
        glob    5.30/s      --    -92%
        readdir 65.3/s   1131%      --

        Makeshifts last the longest.
