PerlMonks  

RE: List non-matching files

by eak (Monk)
on Aug 19, 2000 at 23:00 UTC (#28665)


in reply to List non-matching files

In the spirit of the right tool for the job, I think this can be done a lot faster by just using 'find'. Take a look at this (note that -maxdepth has to come before the tests, or GNU find will complain):
find . -maxdepth 1 -type f -not -name '*.html' -exec rm -rf {} \;
--eric

Replies are listed 'Best First'.
RE: RE: List non-matching files
by fundflow (Chaplain) on Aug 19, 2000 at 23:19 UTC
    Well, not really.

    Think about:

    > nom *.html *.jpg `list-directories`

    Got the picture?
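
    The multi-pattern behaviour fundflow is pointing at can be sketched in pure Perl. This is only an illustration of the idea, not the actual nom code; `glob_to_re` and `non_matching` are hypothetical helper names:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn one shell glob into an anchored regex ('*' -> '.*', '?' -> '.').
sub glob_to_re {
    my ($glob) = @_;
    my $re = quotemeta $glob;
    $re =~ s/\\\*/.*/g;
    $re =~ s/\\\?/./g;
    return qr/^$re$/;
}

# Return the names that match none of the given globs -- the files a
# nom-style tool would list (or delete).
sub non_matching {
    my ($patterns, $names) = @_;
    my @res = map { glob_to_re($_) } @$patterns;
    return grep { my $name = $_; !grep { $name =~ $_ } @res } @$names;
}

print "$_\n" for non_matching([ '*.html', '*.jpg' ],
                              [ 'index.html', 'pic.jpg', 'notes.txt' ]);
# prints: notes.txt
```

    Because the patterns are plain arguments rather than shell-expanded filenames, several of them can be combined in one call, which is the point of the example above.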

RE (tilly) 2: List non-matching files
by tilly (Archbishop) on Aug 20, 2000 at 00:17 UTC
    In general, pure Perl solutions tend to be faster than find, for all the reasons that Perl usually beats shell scripting: you don't have to keep launching processes. In this case it comes down to launching one rm and passing it a lot of filenames versus launching an rm per file. Guess which I think is faster?

    However find has one huge advantage. It is one of the few ways to get around limitations with listing large numbers of files in shell scripts. The nom script given doesn't do that.

    A second advantage is that while find has a more complex API, it is also more flexible... :-)
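
    Tilly's one-process point can be sketched with the core File::Find module. `nom_dir` is a hypothetical helper name, and the depth-1 restriction is an assumption carried over from the find commands in this thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Delete every plain file directly under $dir that doesn't end in
# .html, without descending into subdirectories and without spawning
# a single external process.
sub nom_dir {
    my ($dir) = @_;
    find(sub {
        # Refuse to walk into subdirectories: emulates -maxdepth 1.
        if (-d && $File::Find::name ne $dir) {
            $File::Find::prune = 1;
            return;
        }
        unlink $_ or warn "unlink $File::Find::name: $!"
            if -f && !/\.html$/;
    }, $dir);
}
```

    File::Find chdir()s into each directory it visits by default, so unlinking the bare basename in $_ works as written.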

      find . -maxdepth 1 -type f -not -name '*.html' -print | xargs rm -rf

      The above launches only three processes (well, a few more if xargs decides there are too many files), and since the task is I/O bound, I doubt a Perl-based solution would be significantly faster (personally, my wager is that it would be slower).

      However, I agree that shell scripting is in general slower than Perl, because of process creation. I just don't think this case counts.

      Ciao,
      Gryn :)

        And it freaks badly if you have any filenames with whitespace in them, especially a newline. This same thing in Perl works just fine in one process:
        #!/usr/bin/perl
        opendir DOT, ".";
        unlink grep { -f and not /\.html$/ } readdir DOT;

        -- Randal L. Schwartz, Perl hacker

        In case you care:

        find . -maxdepth 1 -type f -not -name '*.html' -exec rm '{}' \;

        Does the same job in a single command, without piping filenames through xargs (though, as noted above, -exec ... \; still launches one rm per file).

        For me, performance doesn't matter that much. I usually deal with 100..2000 files at a time, and the running time isn't much different.

        One should measure the total time from the split second in which your brain decides what to do until you see the next command prompt :)

        This is why it's useful to have simple basic blocks (with short names :) that do the job.
