PerlMonks  

How to Get the Last Subdirectories

by bichonfrise74 (Vicar)
on Mar 11, 2010 at 20:54 UTC ( #828135=perlquestion )
bichonfrise74 has asked for the wisdom of the Perl Monks concerning the following question:

Given the output from the code below, how would I get the final subdirectories for a certain directory?
    #!/usr/bin/perl
    use strict;
    use File::Find;

    find( sub { print "$File::Find::name \n" if -d }, '/tmp/a' );
Below is the output from the code above.
    /tmp/a
    /tmp/a/b
    /tmp/a/b/c
    /tmp/a/b/c/e
    /tmp/a/b/d
    /tmp/a/b/d/g
    /tmp/a/b/d/g/h
    /tmp/a/b/d/g/i
    /tmp/a/b/k
But the output that I want is
    /tmp/a/b/c/e
    /tmp/a/b/d/g/h
    /tmp/a/b/d/g/i
    /tmp/a/b/k
So, for example, I do not need '/tmp/a' because it still has subdirectories underneath it.

Thanks in advance.

Re: How to Get the Last Subdirectories
by Anonymous Monk on Mar 11, 2010 at 21:05 UTC
    What have you tried?
      Unfortunately, I have not tried anything yet. It's more like I am just thinking of how I should do it. Below is the idea that I was thinking.

    • Create a recursive hash and store each subdirectory as a key.
    • Loop through the recursive hash and place the 'directory' in a new hash. So, if the key is already in the new hash, then continue to the next one.

      But the above idea might be making things complicated. I was wondering whether there is a simpler approach, or whether I am just overcomplicating the problem.
        You are basically interested in those directories in your tree which have no subdirectories; so the following algorithm should work:
        1. Initially create an empty hash %leaf_directories
        2. Whenever File::Find drops you into a directory $d, do the following:
          1. Remove the parent directory of $d from the hash, i.e. if $d contains the full path, do a delete $leaf_directories{dirname($d)}. Of course this will fail occasionally (because there is no corresponding entry), but we ignore this.
          2. Add $d to your hash, i.e. $leaf_directories{$d}=1
        In the end, keys %leaf_directories should be the list of the directories without subdirectories.
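The steps above can be sketched with File::Find roughly as follows. This is only a minimal sketch, not the poster's own code: the helper name leaf_dirs is made up for illustration, and it relies on File::Find's default order, in which a directory is visited before its children. dirname comes from the core File::Basename module.

```perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(dirname);

# Hypothetical helper (name chosen for this sketch): returns the sorted list
# of directories under $top that contain no subdirectories.
sub leaf_dirs {
    my ($top) = @_;
    my %leaf_directories;
    find(
        {
            no_chdir => 1,    # keep full paths valid for the -d test
            wanted   => sub {
                my $d = $File::Find::name;
                return unless -d $d;
                # The parent now has a subdirectory, so it is not a leaf.
                # Deleting a key that does not exist is harmless.
                delete $leaf_directories{ dirname($d) };
                # Assume $d is a leaf until one of its children shows up.
                $leaf_directories{$d} = 1;
            },
        },
        $top
    );
    return sort keys %leaf_directories;
}
```

With the sample tree from the question, print "$_\n" for leaf_dirs('/tmp/a'); should list the four directories the original poster asked for.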


        -- 
        Ronald Fischer <ynnor@mm.st>
Re: How to Get the Last Subdirectories
by toolic (Chancellor) on Mar 11, 2010 at 21:36 UTC
    One brute-force method is:
    • Store all your paths into an array.
    • Two nested loops through the array
    • Create a hash of parent directories, using index.
    • Loop through your hash and print just the leaf directories.
        use strict;
        use warnings;

        my @dirs;
        while (<DATA>) {
            chomp;
            push @dirs, $_;
        }

        my %parents;
        for my $dir1 (@dirs) {
            for my $dir (@dirs) {
                # The trailing slashes stop a sibling such as /tmp/a/bc
                # from being counted under /tmp/a/b
                if (index("$dir1/", "$dir/") == 0) {
                    # $dir is $dir1 itself or one of its parents
                    $parents{$dir}++;
                }
            }
        }

        # A leaf directory only ever matches itself
        for (keys %parents) {
            print "$_\n" if $parents{$_} == 1
        }

        __DATA__
        /tmp/a
        /tmp/a/b
        /tmp/a/b/c
        /tmp/a/b/c/e
        /tmp/a/b/d
        /tmp/a/b/d/g
        /tmp/a/b/d/g/h
        /tmp/a/b/d/g/i
        /tmp/a/b/k
    Prints:
        /tmp/a/b/d/g/h
        /tmp/a/b/d/g/i
        /tmp/a/b/k
        /tmp/a/b/c/e
    I used this technique to solve a similar problem. Hopefully, our fellow monks will provide a more elegant solution.
Re: How to Get the Last Subdirectories
by liverpole (Monsignor) on Mar 11, 2010 at 21:47 UTC
    Hi bichonfrise74,

    I would just use opendir and readdir recursively to scan the subdirectories yourself, and save the subdirectory only when it doesn't contain any subordinate subdirectories.

    For example:

        use strict;
        use warnings;
        use FileHandle;

        my $h_dirs = terminal_subdirs("/tmp/a");
        my @dirs   = sort keys %$h_dirs;
        print "Terminal Directories:\n", join("\n", @dirs);

        sub terminal_subdirs {
            my ($top, $h_results) = @_;
            $h_results ||= { };
            my $fh = new FileHandle;
            opendir($fh, $top) or die "Arrggghhhh -- can't open '$top' ($!)\n";
            my @files = readdir($fh);
            closedir $fh;
            my $nsubdirs = 0;
            foreach my $fn (@files) {
                next if ($fn eq '.' or $fn eq '..');
                my $full = "$top/$fn";
                if (!-l $full and -d $full) {
                    ++$nsubdirs;
                    terminal_subdirs($full, $h_results);
                }
            }
            $nsubdirs or $h_results->{$top} = 1;
            return $h_results;
        }

    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re: How to Get the Last Subdirectories
by rubasov (Friar) on Mar 11, 2010 at 22:06 UTC
    This code is probably a little too tricky; however, if I'm right, it does just what it needs to.

        #! /usr/bin/perl
        use 5.010;
        use strict;
        use warnings;
        use File::Find;

        my $dir = $ARGV[0] // q{.};

        sub deepest {
            return if not -d;
            state $prev_dir = '';
            # trailing slashes keep "a/b" from matching its sibling "a/bc"
            say if index "$prev_dir/", "$_/";
            $prev_dir = $_;
        }

        find(
            {
                wanted   => \&deepest,
                no_chdir => 1,
                bydepth  => 1,
            },
            $dir
        );

    The main idea behind this code is the following: if you traverse your directory structure in depth-first order, then you only need to check whether your current directory path is not a prefix (a slice starting at position 0) of the previous directory path. If it's not a prefix of the previous value, then print it.

    To ease the understanding: your sample dir structure in depth-first order:

        a/b/d/g/h
        a/b/d/g/i
        a/b/d/g      # this is a prefix of the previous, so we don't want it
        a/b/d
        a/b/k        # this is not a prefix of the previous, so we want it
        a/b/c/e
        a/b/c
        a/b
        a
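
The prefix test can also be tried on a plain list of paths, without touching the filesystem. The following is a minimal sketch under the assumption that the paths arrive in the children-first order shown above; the helper name leaves_from_postorder is made up for illustration. It compares the paths with a trailing slash appended, so that a directory such as a/b is not mistaken for a parent of its sibling a/bc.

```perl
use strict;
use warnings;

# Hypothetical helper: given directory paths in depth-first, children-first
# order, return only those that are not a parent of the path seen just before.
sub leaves_from_postorder {
    my @paths = @_;
    my @leaves;
    my $prev = '';
    for my $path (@paths) {
        # index() returns 0 only when "$path/" starts "$prev/", i.e. when
        # $path is a parent of (or equal to) the previous directory.
        push @leaves, $path if index( "$prev/", "$path/" ) != 0;
        $prev = $path;
    }
    return @leaves;
}

my @order = qw(
    a/b/d/g/h a/b/d/g/i a/b/d/g a/b/d a/b/k a/b/c/e a/b/c a/b a
);
print "$_\n" for leaves_from_postorder(@order);
```

For the sample order this keeps a/b/d/g/h, a/b/d/g/i, a/b/k and a/b/c/e.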

      ++, nice idea.  Here's a variation of your approach, making use of the postprocess option:

          #!/usr/bin/perl -l
          use File::Find;

          my @dirs;
          find(
              {
                  wanted      => sub {},
                  postprocess => sub {
                      # trailing slashes keep "a/b" from matching its sibling "a/bc"
                      push @dirs, $File::Find::dir
                          if index( ( $dirs[-1] || "" ) . "/", "$File::Find::dir/" );
                  },
              },
              '/tmp/a'
          );
          print for @dirs;
Re: How to Get the Last Subdirectories
by FunkyMonk (Canon) on Mar 11, 2010 at 22:27 UTC
        use Data::Dump 'pp';

        my @dirs = qw(
            tmp/a
            tmp/a/b
            tmp/a/b/c
            tmp/a/b/c/e
            tmp/a/b/d
            tmp/a/b/d/g
            tmp/a/b/d/g/h
            tmp/a/b/d/g/i
            tmp/a/b/k
        );

        my %empty;
        for (@dirs) {
            $empty{$_} = 1;       # assume empty
            s!/[^/]+$!!;          # find parent
            delete $empty{$_};    # and remove it
        }

        pp keys %empty;

        __END__
        ("tmp/a/b/k", "tmp/a/b/d/g/h", "tmp/a/b/c/e", "tmp/a/b/d/g/i")


    Unless I state otherwise, all my code runs with strict and warnings
Re: How to Get the Last Subdirectories
by sigma8 (Initiate) on Mar 11, 2010 at 22:55 UTC

    I don't think there is a way to do it inline in the File::Find subroutine without recursion, but if you can wait until the end, I think this should do it:

        #!/usr/bin/perl
        use strict;
        use File::Find;

        my %seen;
        find(
            sub {
                if (-d) {
                    $seen{$File::Find::name}++;
                    delete $seen{$File::Find::dir};
                }
            },
            '/tmp/a'
        );
        print join "\n", (keys %seen, undef);

    This creates a hash key for every directory, and deletes every key that has the same name as the parent. Therefore directories that have no children will never get deleted.
Re: How to Get the Last Subdirectories
by pemungkah (Priest) on Mar 12, 2010 at 00:50 UTC
    I started by setting up a sample set of test directories:
        [mcmahon@joe-desk ~]$ ls -R ./example/
        ./example/:
        file  nonempty_files_only  nonempty_has_dirs

        ./example/nonempty_files_only:
        file1  file2

        ./example/nonempty_has_dirs:
        file1  one  two

        ./example/nonempty_has_dirs/one:

        ./example/nonempty_has_dirs/two:
    That's a directory containing files and other (nonempty) directories, one containing only files, and one containing a file and two empty directories.
        sub dive {
            my($d) = shift;
            return if ! -d $d;
            my @contents = glob("$d/*");
            return $d unless @contents;
            my @below = map { dive($_) } @contents;
            return @below ? @below    # Stuff below qualifies, this doesn't
                          : $d;       # Nothing below qualifies, this does
        }

        $d = './example';
        print join ", ", dive($d),"\n";
    This prints
        ./example/nonempty_files_only, ./example/nonempty_has_dirs/one, ./example/nonempty_has_dirs/two
    The tricky bit is postponing the decision about whether the current directory is good until you've seen if any subdirectories of it qualify.

    Edit: Removed the majority of the comments as they were actually obscuring how short this is; renamed @queue as it was a leftover from a previous, longer, iterative version.

Re: How to Get the Last Subdirectories
by admiral_grinder (Pilgrim) on Mar 12, 2010 at 20:01 UTC
    What the hell, here is my stab at this. It doesn't work when it comes across an unreadable object such as 'C:\System Volume Information', but that might be an issue in Path::Class::Dir rather than in my code.

        #!perl
        use strict;
        use warnings;
        use Path::Class;    #file(), dir()
        use Cwd;            #getcwd()

        my $start_dir = dir( $ARGV[0] || getcwd() );
        #print "DEBUG: Scanning $start_dir\n";
        $start_dir->recurse( callback => \&report_leaf_dirs );

        sub report_leaf_dirs {
            my $object = shift;
            #print "DEBUG: processing $object\n";
            return unless $object->is_dir();

            # Test to see if we can read it
            unless( $object->open() ) {
                warn "Unable to open $object\n";
                return;
            }

            # Test for sub directories
            foreach my $child ( $object->children() ) {
                return if $child->is_dir();
            }

            print "$object\n";
        }

Node Type: perlquestion [id://828135]
Approved by toolic
Front-paged by toolic