
A warning about passing mountpoints as arguments to directory recursion programmes

by Don Coyote (Pilgrim)
on Apr 26, 2013 at 14:05 UTC (#1030835=perlmeditation)

taint glad you mentioned this File::Find seems grossly inefficient for performing simple file tasks

Honing my understanding and skills, I have a script which accesses a mounted Windows system and clears out temporary folders. Great for learning about recursion, etc.

I set about providing myself with some default filepaths and mountpoints, in case I decide not to provide them as arguments to my script. The starting directories are listed in a file; the script runs through those paths and recursively unlinks the files under each directory.

I developed a nice little help parser

    die 'do not run this code';
    # get arguments
    my ($filepathlist, $mountpoint) = @ARGV;
    die 'usage & defaults' if $filepathlist =~ /^-{0,2}h(elp)?\s*$/;

and set my default paths

    die 'do not run this code';
    $filepathlist = 'home/Documents/directorieslist' unless $filepathlist;
    $mountpoint = '/mountpoint/windows/' unless $mountpoint;
    push @dirpaths, readdir($filepathlist);

    # homemade File::Find with custom unlinking behaviour
    # simplified for this example
    sub recursivelyunlinkfiles {
        while (my $path = shift @dirpaths) {
            $path = $mountpoint . $path;
            unlink if -f $path;
            &recursivelyunlinkfiles if -d $path;
        }
    }

So this works if no arguments are provided, and if one argument is provided it sets the filepath and uses the default mountpoint.

I went back to clear up the edge case of an orphaned hyphen when requesting the usage info. At that point I realised the tremendous disaster which lay ahead, had I tested this with a single argument consisting of either a filepath or a mountpoint. Of course, I had put die statements everywhere, like a keen domino course constructor interspersing the frail light blocks with large heavy lumps of immutable iron. The debugger runs the code, and dying in the source is a good way to reinforce your breakpoints.

Can you see what might conceivably go wrong here?

    # use default file, but supply mountpoint
    > perl ./ /mount/path

What my dynamic and helpful code did not yet consider was that a mountpoint is also a directory path. Had I tested with no dirpath file and only a mountpoint, the mountpoint would have been assigned to $filepathlist, and $mountpoint, being undefined, would have been set to the default mountpoint.

My script would have called readdir on the mount path and then recursed through my mounted OS, happily unlinking several years of 'I must back this up soon' data.

Pass me some more rope please, anyone?
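One way to stop a lone mountpoint from sliding into the file-list slot is to use named options instead of positional ones. The following is only a sketch, not the script from this post: the option names --list and --mount and the helpers parse_args and check_paths are hypothetical, and the defaults are copied from the code above purely for illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical defaults, copied from the post for illustration only.
my %DEFAULT = (
    list  => 'home/Documents/directorieslist',
    mount => '/mountpoint/windows/',
);

# Parse named options so a bare path can never be mistaken for the
# file list. Returns (\%opt) on success, or (undef, $reason) on failure.
sub parse_args {
    my @argv = @_;
    my %opt  = %DEFAULT;
    GetOptionsFromArray(\@argv, \%opt, 'list=s', 'mount=s', 'help|h')
        or return (undef, 'bad option');
    return (undef, 'usage')                         if $opt{help};
    return (undef, "unexpected argument: $argv[0]") if @argv;
    return (\%opt);
}

# Refuse to continue unless each value is the kind of thing we expect.
# Returns an error string, or nothing when both paths look sane.
sub check_paths {
    my ($opt) = @_;
    return "file list '$opt->{list}' is not a plain file"
        unless -f $opt->{list};
    return "mountpoint '$opt->{mount}' is not a directory"
        unless -d $opt->{mount};
    return;
}
```

With this shape, the invocation shown above (a single bare path) is rejected as an unexpected argument rather than being silently treated as the list of directories.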

Replies are listed 'Best First'.
Re: A warning about passing mountpoints as arguments to directory recursion programmes
by davido (Archbishop) on Apr 26, 2013 at 14:20 UTC

    Anytime I prepare to act upon several files that are selected programmatically (such as using File::Find or File::Find::Rule), I first disable the action code and switch it to logging code, so that I can see what files would be acted upon if the action code were live. Anytime I forget to do that I end up creating a mess.
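    A minimal sketch of that habit, assuming File::Find and a hypothetical helper named clear_files. The dry-run flag defaults to on, so deletion has to be requested explicitly.

```perl
use strict;
use warnings;
use File::Find;

# Walk $top and either report or delete each plain file.
# $dry_run defaults to true: pass an explicit 0 to really unlink.
sub clear_files {
    my ($top, $dry_run) = @_;
    $dry_run = 1 unless defined $dry_run;
    my @seen;
    find({
        no_chdir => 1,    # keep the cwd fixed so full paths stay valid
        wanted   => sub {
            my $path = $File::Find::name;
            return unless -f $path;
            push @seen, $path;
            if ($dry_run) {
                print "would unlink $path\n";
            }
            else {
                unlink $path or warn "cannot unlink $path: $!\n";
            }
        },
    }, $top);
    return @seen;    # every file that was, or would have been, removed
}
```

    Calling clear_files($dir) only prints the candidates; clear_files($dir, 0) performs the deletion once the printed list looks right.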


Re: A warning about passing mountpoints as arguments to directory recursion programmes
by ambrus (Abbot) on Apr 30, 2013 at 19:37 UTC

    Maybe it's just that I'm tired, but I don't see how this code is supposed to work. I have multiple problems.

    Firstly, the recursivelyunlinkfiles function tries to call itself multiple times, but doesn't have an argument that's changed for the inner calls. What's the point of the recursion then?

    Secondly, you're calling unlink without an argument so it defaults to unlinking $_, but I don't see where you're setting $_.

    Thirdly, you don't filter out . and .. from the files readdir returns. If you actually recursed on the directory tree, this would lead to either a quick infinite loop or dangerous code that traverses the entire file system.
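    For illustration, here is a sketch of a recursion that addresses those three points. It is not the original poster's code: the name unlink_files_under is hypothetical, and skipping symlinks is an extra precaution assumed here rather than something raised in this thread.

```perl
use strict;
use warnings;
use File::Spec;

# Delete plain files below $dir and return how many were removed.
# The directory is an argument, so each recursive call gets a new path.
sub unlink_files_under {
    my ($dir) = @_;
    opendir(my $dh, $dir)
        or do { warn "cannot open $dir: $!\n"; return 0 };
    # readdir returns bare names, including '.' and '..'; drop those two.
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    my $count = 0;
    for my $name (@names) {
        my $path = File::Spec->catfile($dir, $name);
        if (-l $path) {
            next;    # leave symlinks alone so we never leave the tree
        }
        elsif (-d $path) {
            $count += unlink_files_under($path);
        }
        elsif (-f $path) {
            # unlink gets an explicit path instead of relying on $_
            if (unlink $path) { $count++ }
            else              { warn "cannot unlink $path: $!\n" }
        }
    }
    return $count;
}
```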

    PS. if you just want a recursive deletion function, then take a look at File::Path or File::Path::Simple.
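    As a rough sketch of the File::Path route: remove_tree with keep_root empties a directory while leaving the directory itself in place. The wrapper name empty_directory is hypothetical, and unlike the original script this also removes subdirectories, not just files.

```perl
use strict;
use warnings;
use File::Path qw(remove_tree);

# Empty $dir but keep the directory itself. Returns an array ref of
# removed paths and an array ref of "path: message" failure strings.
sub empty_directory {
    my ($dir) = @_;
    remove_tree($dir, {
        keep_root => 1,               # do not remove $dir itself
        result    => \my $removed,    # filled with what was deleted
        error     => \my $errors,     # filled with one hash per failure
    });
    my @problems;
    for my $diag (@{ $errors || [] }) {
        my ($path, $message) = %$diag;
        push @problems,
            ($path eq '' ? 'general error' : $path) . ": $message";
    }
    return ($removed || [], \@problems);
}
```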

Re: A warning about passing mountpoints as arguments to directory recursion programmes
by sundialsvc4 (Abbot) on May 16, 2013 at 21:24 UTC

    One very painful lesson that I have learned from a great many programming and operating-system environments is that, especially if you intend to alter the filesystem structure, you should “iterate first, and act second.” Any sort of change to the filesystem topology can be very disruptive to a “walker”, and even slightly deep nested directory structures can run afoul of many hidden, operating-system-imposed limits.

    What I customarily try to do is to walk through some reasonable subset of the territory that is “downstream of” wherever I started, accumulating along the way a list of the fully qualified filenames of things I have decided to act upon and, separately, a list of “(more) places I want to get to soon.” I complete that walk first, then switch gears and act upon the files. Having completed that task, I switch back to exploring more locations, popping entries off the “to-do” list and silently ignoring any that are no longer there.

    This approach enables the filesystem walker to traverse a structure that can be relied upon not to change (not by my hand, anyway ...), and it generally works quite well.
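    A simplified sketch of “iterate first, act second”, under the assumption that the whole tree is small enough to collect in one pass (the description above interleaves partial walks with actions; this version does one full walk and then acts). The names collect_files and act_on_files are hypothetical.

```perl
use strict;
use warnings;

# Phase 1: walk the tree breadth-first and only build lists.
sub collect_files {
    my ($top) = @_;
    my @todo  = ($top);    # directories still to visit
    my @files;             # fully qualified files chosen for action
    while (defined(my $dir = shift @todo)) {
        opendir(my $dh, $dir) or next;    # skip anything that vanished
        for my $name (readdir $dh) {
            next if $name eq '.' || $name eq '..';
            my $path = "$dir/$name";
            next if -l $path;             # do not follow symlinks
            if    (-d $path) { push @todo,  $path }
            elsif (-f $path) { push @files, $path }
        }
        closedir $dh;
    }
    return @files;
}

# Phase 2: act on the collected list; no directory is read here.
# $action is a code ref that returns true when it succeeded.
sub act_on_files {
    my ($action, @files) = @_;
    my $done = 0;
    for my $path (@files) {
        next unless -e $path;    # gone since phase 1: silently ignore
        $done++ if $action->($path);
    }
    return $done;
}
```

    For example, my @files = collect_files($top); can be printed and inspected before anything such as act_on_files(sub { unlink $_[0] }, @files) is run.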

Node Type: perlmeditation [id://1030835]
Approved by Corion
Front-paged by Old_Gray_Bear