http://www.perlmonks.org?node_id=296415


in reply to Re: Re: Efficient processing of large directory
in thread Efficient processing of large directory

You should readdir instead. Also, if this is running from a CGI (I guess that's what timing out is referring to), then make sure to give the client a few bytes of data every now and then so it doesn't give up waiting.
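A minimal sketch of both suggestions together, assuming a hypothetical helper named `scan_dir` and an arbitrary keep-alive interval; neither comes from the original posts. It reads the directory one entry at a time with readdir and writes a single space to the output handle every so often, so a waiting client keeps receiving data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): process a directory entry by
# entry with readdir, printing one keep-alive byte every $every entries.
sub scan_dir {
    my ($dir, $callback, $every, $out) = @_;
    $every ||= 1000;        # assumed interval; tune it to the client's timeout
    $out   ||= \*STDOUT;

    # Unbuffer the output handle so keep-alive bytes are sent immediately.
    my $old = select($out);
    $| = 1;
    select($old);

    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my $count = 0;
    while (defined(my $entry = readdir $dh)) {
        next if $entry eq '.' or $entry eq '..';
        $callback->($entry);
        # Emit a single space each time $count reaches a multiple of $every.
        print {$out} ' ' unless ++$count % $every;
    }
    closedir $dh;
    return $count;
}
```

In a CGI script the headers would have to be printed before the first call, and the keep-alive character has to be something the response format tolerates; a space is usually harmless in HTML or plain text.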

Makeshifts last the longest.

Re: Re^3: Efficient processing of large directory
by Elliott (Pilgrim) on Oct 05, 2003 at 15:57 UTC
    Now that I know readdir exists (thanks, Aristotle!), I was able to RTFM and put it into practice. The functions that do not need to open the files have improved stunningly. So much so that the client rang me at home to thank me and backed it up with an email full of exclamation marks.

    Now I have to further improve the processes that have to read all the files. But I guess that's another thread!

Re: Re^3: Efficient processing of large directory
by BrowserUk (Patriarch) on Oct 04, 2003 at 00:11 UTC

    readdir is less efficient than glob if a subset of the directory contents is sought.
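For reference, the two ways of selecting a subset might look like the sketch below. The `.txt` suffix and the function names `subset_glob` and `subset_readdir` are assumptions made for illustration. The sketch shows only that both approaches can return the same names and says nothing about which is faster.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);

# Subset via glob: the pattern itself does the filtering.
sub subset_glob {
    my ($dir) = @_;
    my @names = map { basename($_) } bsd_glob("$dir/*.txt");
    return sort @names;
}

# Subset via readdir: read every entry, then filter with grep.
# Dotfiles are skipped to match what the '*' wildcard does in glob.
sub subset_readdir {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = grep { !/^\./ && /\.txt\z/ } readdir $dh;
    closedir $dh;
    return sort @names;
}
```

Wrapping each function in a Benchmark `cmpthese` call against a real directory would be the way to check the efficiency claim for a given case.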


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
    If I understand your problem, I can solve it! Of course, the same can be said for you.

      Apart from the fact that the OP isn't looking for a subset.
      #!/usr/bin/perl
      use strict;
      use warnings;
      use Benchmark qw /cmpthese/;

      chdir shift or die;

      cmpthese -5 => {
          glob    => sub { 1 while glob '*'; },
          readdir => sub { opendir my $dh, '.' or die; 1 while readdir $dh; },
      };
      __END__
      $ mkdir x ; cd x ; touch `seq 1 20000` ; cd ..
      $ perl glob_vs_readdir.pl x
      Benchmark: running glob, readdir for at least 5 CPU seconds...
            glob:  6 wallclock secs ( 4.02 usr +  1.07 sys =  5.09 CPU) @  5.30/s (n=27)
         readdir:  6 wallclock secs ( 3.93 usr +  1.37 sys =  5.30 CPU) @ 65.28/s (n=346)
                Rate    glob readdir
      glob    5.30/s      --    -92%
      readdir 65.3/s   1131%      --

      Makeshifts last the longest.