PerlMonks  


It's worth noting that if you're trying to find a subset of the files contained in a subdir, rather than processing them all, then using <*.txt> is considerably faster than using either File::Find or opendir/readdir/closedir. At least that is the case under Win32, as the wildcard matching is done by the OS and only those files matching are passed back.

In the examples below, the first comparison shows selecting all 17576 files in a subdirectory. In this case, glob and File::Find come out pretty much even.

In the second comparison, a subset of 676 files is selected from the 17576 using a wildcard. In this case, the glob runs roughly 650% faster, as it is only processing the 676 rather than looping over the whole 17000+.

Of course, if any real processing were being done rather than just counting the files, the difference would rapidly disappear.

In this case, the OP's use of the word "efficient" most likely had to do with the memory used by slurping all 17000 names into memory rather than with speed, but if not all of those 17000 files are .txt files, the time saved might be worth having.
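If memory rather than speed is the concern, one option is to walk the directory with opendir/readdir and look at one name at a time. The following is only a minimal sketch of that idea, not the OP's code: count_matching() is a hypothetical helper, and the throwaway directory and file names exist purely so the example runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not from the original post): walk $dir one entry
# at a time and count the names that match $re.
sub count_matching {
    my ( $dir, $re ) = @_;
    opendir( my $dh, $dir ) or die "Cannot open $dir: $!";
    my $count = 0;
    # readdir in scalar context returns one name per call, so the script
    # never builds an array holding every name in the directory.
    while ( defined( my $name = readdir $dh ) ) {
        next unless $name =~ $re;
        $count++;    # real per-file processing would go here
    }
    closedir $dh;
    return $count;
}

# Build a small throwaway directory so the sketch runs anywhere.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw( AAA.txt AAB.txt ABA.txt notes.log )) {
    open( my $fh, '>', "$dir/$name" ) or die "Cannot create $name: $!";
    close $fh;
}

print 'all .txt: ', count_matching( $dir, qr/\.txt$/ ),  "\n";
print 'A.txt:    ', count_matching( $dir, qr/A\.txt$/ ), "\n";
```

The trade-off is the one described above: every name in the directory still passes through the loop, so on a large directory with few matches this is likely to be slower than letting the wildcard do the filtering.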

#! perl -slw
use strict;
use Benchmark qw[ cmpthese ];
use File::Find;

our( $dir, $glob, $re_glob );

my %tests = (
    GLOB => q[
        my $count;
        $count++ while <${dir}/${glob}>;
        print 'GLOB: ', $count;
    ],
    FIND => q[
        my $count;
        find( sub{ m[$re_glob] and $count++ }, $dir );
        print 'FIND: ', $count;
    ],
);

( $dir, $glob, $re_glob ) = ( 'bigdir', '/*.txt', qr[\.txt$] );
cmpthese( 3, \%tests );

( $dir, $glob, $re_glob ) = ( 'bigdir', '*A.txt', qr[A\.txt$] );
cmpthese( 10, \%tests );

__END__
P:\test>glob-ff
FIND: 17576
FIND: 17576
FIND: 17576
GLOB: 17576
GLOB: 17576
GLOB: 17576
     s/iter FIND GLOB
FIND   29.6   --  -1%
GLOB   29.4   1%   --
FIND: 676
FIND: 676
FIND: 676
FIND: 676
FIND: 676
FIND: 676
FIND: 676
FIND: 676
FIND: 676
FIND: 676
GLOB: 676
GLOB: 676
GLOB: 676
GLOB: 676
GLOB: 676
GLOB: 676
GLOB: 676
GLOB: 676
GLOB: 676
GLOB: 676
     s/iter FIND GLOB
FIND   29.6   --  -87%
GLOB   3.97 646%   --
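For anyone wondering how the 646% and -87% figures relate to the s/iter column: as far as I understand cmpthese, the percentages compare rates (iterations per second), so they can be reproduced from the two timings. A small sketch, where pct_faster() is a hypothetical helper and not part of Benchmark:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Benchmark): how much faster A is than B,
# as a percentage, given seconds per iteration for each.
# A rate is 1/seconds, so rate_a / rate_b equals secs_b / secs_a.
sub pct_faster {
    my ( $secs_a, $secs_b ) = @_;
    return 100 * ( $secs_b / $secs_a ) - 100;
}

# Timings taken from the second table in the post.
printf "GLOB vs FIND: %.0f%%\n", pct_faster( 3.97, 29.6 );
printf "FIND vs GLOB: %.0f%%\n", pct_faster( 29.6, 3.97 );
```

In other words, GLOB gets through about 7.5 times as many iterations per second as FIND, which is where the "roughly 650% faster" figure comes from.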

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.


In reply to Re: Re: Efficient processing of large directory by BrowserUk
in thread Efficient processing of large directory by Elliott
