How to map a directory tree to a perl hash tree

Rudif has asked for the wisdom of the Perl Monks concerning the following question:

Everyone knows how to walk a directory tree and collect the file paths into a flat list, perhaps filtering them along the way.

However, I want to collect those paths into a tree of hashes, where keys would be file or subdirectory names and values would be a scalar for a file and a hashref for a subdirectory.
Once in a tree, I could apply recursive programs to do whatever I need with my files.

Here is my first try, with sample output (and a directory browser in 3 lines of perl :-).

#! perl -w

use strict;
use File::Find;

my $dirth = dirTreeHash(shift);

sub dirTreeHash {
    my $dir = shift;
    return unless defined $dir && -d $dir;
    $dir =~ s#\\#/#g;    # Win32 :-(
    my $dirth = {};
    find sub {
        return if -d;
        eval  join '', '$dirth->', ( map { '{\'' . $_ . '\'}' } split 
+'/', $File::Find::name ), '=""';
        warn $@ if $@;
    }, $dir;
    $dirth;
}


# dump it 
use Data::Dumper;
$Data::Dumper::Indent = 1;
print Dumper $dirth;


# browse it
use Tk::ObjScanner;  
MainWindow->new->ObjScanner(caller => $dirth)->pack;                  
+              
Tk::MainLoop;


__END__


H:\devperl\perlmonks\TryDirTreeHash.pl H:\devperl\PerlMonksChat2
$VAR1 = {
  'H:' => {
    'devperl' => {
      'PerlMonksChat2' => {
        'README' => '',
        'ChangeLog' => '',
        'Makefile' => '',
        'getchat.pl' => '',
        'getpage.pl' => '',
        'getchat.old.pl' => '',
        'test' => {
          'testnn.pl' => '',
          'nodetypes' => ''
        },
        'PerlMonks' => {
          'Users.pm' => '',
          'NewestNodes.pm' => '',
          'Chat.pm' => ''
        },
        'Noname6.pl' => '',
        'PerlMonks.pm' => '',
        'pmpager.pl' => '',
        'PerlMonksChat.pm' => '',
        'pmchat' => ''
      }
    }
  }
};
[download]

Question:
What other/better ways to perform this dir-to-hash mapping would you suggest?

Rudif

Comment on How to map a directory tree to a perl hash tree Download Code

Replies are listed 'Best First'.
Re: How to map a directory tree to a perl hash tree by japhy (Canon) on Mar 10, 2001 at 20:42 UTC
I use the SNAP (Stacked Nodes Affect Parent) method. I use it for my YAPE modules. The process is rather simple. Make two variables refer to the same reference, and you create an array for stacking. `my $node = my $tree = []; my @stack;` [download] You never modify `$tree` directly -- you do all the manipulation on `$node`, and when you need to recurse, you add a new "field" to `$node`; then you push the `$node` onto the `@stack`, and then you set `$node` equal to the field: `if ($token eq '(') { push @{ $node }, []; push @stack, $node; $node = $node->[-1]; # the new empty array ref }` [download] When you return from the recursion, you get the top value from the `@stack`: `if ($token eq ')') { $node = pop @stack; }` [download] Now, `$node` contains what it was before, PLUS any modifications to that empty array reference. And because `$node` and `$tree` refer to the same data, modifying the data in `$node` modifies `$tree`. It's important to realize that saying: `$a = $b = []; $a = 2;` [download] does not affect `$b` in any way. It's the changing of the data referred to by `$a` that affects `$b`. Without further ado, here is my implementation of a directory tree structure, using SNAP. `#!/usr/bin/perl -w use File::Find; use Data::Dumper; use strict; $Data::Dumper::Indent = 1; build_tree(my $tree, shift); print Dumper $tree; { sub build_tree { my $node = $_[0] = {}; my @s; find( sub { # fixed 'if' => 'while' -- thanks Rudif $node = (pop @s)->[1] while @s and $File::Find::dir ne $s[-1][0] +; return $node->{$_} = -s if -f; push @s, [ $File::Find::name, $node ]; $node = $node->{$_} = {}; }, $_[1]); $_[0]{$_[1]} = delete $_[0]{'.'}; } }` [download] The recursion is not handled by me, it's handled by `find()`, so I had to kludge my way into faking the start and end of a recursion. But it works. `japhy` -- Perl and Regex Hacker	[reply] [d/l] [select]
Re: Re: How to map a directory tree to a perl hash tree by Rudif (Hermit) on Mar 11, 2001 at 00:21 UTC
But it works. Indeed, it works, it is free of evil string eval, and you explained it in English. And I still have 0.5 * $weekend ahead of me, to let it sink in. Thanks! Rudif PS. Your OGRE is awesome. But it adds an extra `(?-imsx:)` around the regex that I type in, e.g. I ask `([\.\d]+)` and it returns and explains `(?-imsx:([\.\d]+))` - why? Update You qr // the submitted regex, right? Just guessing, I did not look into the source code.	[reply] [d/l] [select]
Re: Re: Re: How to map a directory tree to a perl hash tree by japhy (Canon) on Mar 11, 2001 at 09:56 UTC
I don't think so -- that would stop me from being able to point out the errors in a regex my own way. I add the `(?:...)` around it for several reasons. First, to honor any of the `/imsx` switches you have on. Second, so that the regex has ONE root node (easier to parse). Third, so the regex is "its own regex". (That third reason is probably lame.) `japhy` -- Perl and Regex Hacker	[reply]
Re: Re: How to map a directory tree to a perl hash tree by Rudif (Hermit) on Mar 13, 2001 at 04:27 UTC
But it works. Er, almost :-( There is a bug in `sub build_tree`. To fix it, you have to replace the `if` that pops the stack by a `while` . The test below shows the difference. Rudif #! perl -w use strict; use File::Find; use File::Path; use Data::Dumper; $Data::Dumper::Indent = 1; my $dir = './buildtreetestdir'; mkpath "$dir"; mkpath "$dir/dir1"; mkpath "$dir/dir1/dir11"; mkpath "$dir/dir2"; mkpath "$dir/dir2/dir21"; build_tree(my $tree, $dir); build_tree_fixed(my $tree_fixed, $dir); print Data::Dumper->Dump( [ $tree, $tree_fixed ], [ qw / tree tree_fix +ed / ] ); { sub build_tree { my $node = $_[0] = {}; my @s; find( sub { $node = (pop @s)->[1] if @s and $File::Find::dir ne $s[-1][0]; return $node->{$_} = -s if -f; push @s, [ $File::Find::name, $node ]; $node = $node->{$_} = {}; }, $_[1]); $_[0]{$_[1]} = delete $_[0]{'.'}; } } { sub build_tree_fixed { my $node = $_[0] = {}; my @s; find( sub { $node = (pop @s)->[1] while @s and $File::Find::dir ne $s[-1][0] +; return $node->{$_} = -s if -f; push @s, [ $File::Find::name, $node ]; $node = $node->{$_} = {}; }, $_[1]); $_[0]{$_[1]} = delete $_[0]{'.'}; } } __END__ $tree = { './buildtreetestdir' => { 'dir2' => { 'dir21' => {}, 'dir1' => { 'dir11' => {} } } } }; $tree_fixed = { './buildtreetestdir' => { 'dir1' => { 'dir11' => {} }, 'dir2' => { 'dir21' => {} } } }; [download]	[reply] [d/l] [select]
Re: Re: Re: How to map a directory tree to a perl hash tree by japhy (Canon) on Mar 13, 2001 at 04:52 UTC
Sorry. Thanks for the fix. I think I actually fixed it like that after I posted the code, but forgot to fix it here. `japhy` -- Perl and Regex Hacker	[reply]
Re: How to map a directory tree to a perl hash tree by merlyn (Sage) on Mar 10, 2001 at 20:27 UTC
This line: `eval join '', '$dirth->', ( map { '{\'' . $_ . '\'}' +} split '/', $File::Find::name ), '=""';` [download] is dangerous for all the same reasons as it was the last two times this was brought up. See the thread containing my "Re: Re: stringification" for the most recent time, and follow the link contained within it to go to the older thread which looks a lot more like your code. In short (as I'm also saying in two other threads right now as well), AVOID STRING EVAL -- Randal L. Schwartz, Perl hacker	[reply] [d/l]
Re: Re: How to map a directory tree to a perl hash tree by Rudif (Hermit) on Mar 11, 2001 at 00:52 UTC
AVOID STRING EVAL I will. I did have an inkling that a solution involving multiple references to the same data item (a nascent hash entry), perhaps combined with recursion, should be possible, but I could not see how it would fit together. But then as I wrote some test cases, initializing multilevel hashes, it dawned on me that I could construct similar Perl code on the fly and eval it - my first ever creative use for string eval. I must admit, I took the easy way out .-( I probably cost me a --, but I'm wiling to pay my way. Rudif	[reply]
Re: How to map a directory tree to a perl hash tree by Masem (Monsignor) on Mar 10, 2001 at 19:23 UTC
Might not a recursive method of loading this hash work better? Namely, if the 'file' is a directory, then it's entry would just be something like `$dirth{ $dir } = dirTreeHash( $dir )`. Dr. Michael K. Neylon - mneylon-pm@masemware.com \|\| "You've left the lens cap of your mind on again, Pinky" - The Brain	[reply] [d/l]
(tye)Re2: How to map a directory tree to a perl hash tree by tye (Sage) on Mar 11, 2001 at 00:52 UTC
Yes, but then you couldn't use File::Find (one of several problems with File::Find's call-back interface). - tye (but my friends call me "Tye")	[reply]
Re: How to map a directory tree to a perl hash tree by DeaconBlues (Monk) on Mar 11, 2001 at 12:10 UTC
Here's another way. I handle the recursion myself, without File::Find. I used your example as a model. Is this what you want or am I missing something? `#!/usr/bin/perl use strict; use warnings; use Data::Dumper; my $dirth = &dirTreeHash(shift); sub dirTreeHash { my $dir = shift; return unless (defined $dir && -d $dir); $dir =~ s#\\#/#g; # Win32 :-( my $dirth = {}; opendir(DIR, $dir) \|\| die "Unable to opendir $dir\n"; my @files = grep {!/^\.\.?$/} readdir(DIR); closedir(DIR); map {$dirth->{$_} = (-d "$dir/$_" ? &dirTreeHash("$dir/$_") : '')} @ +files; return $dirth; } $Data::Dumper::Indent = 1; print Dumper $dirth;` [download]	[reply] [d/l]
Re: Re: How to map a directory tree to a perl hash tree by merlyn (Sage) on Mar 12, 2001 at 15:25 UTC
This one goes off into gah-gah land on a symlink, including an infinite recurse if the symlink points at a parent. Just another reason to prefer File::Find. -- Randal L. Schwartz, Perl hacker	[reply]
Re: Re: Re: How to map a directory tree to a perl hash tree by DeaconBlues (Monk) on Mar 13, 2001 at 01:35 UTC
Sorry Randal. How about if I change the line from `map {$dirth->{$_} = (-d "$dir/$_" ? &dirTreeHash("$dir/$_") : '')} @fi +les;` [download] to `map {$dirth->{$_} = (-d "$dir/$_" ? &dirTreeHash("$dir/$_") : '') unle +ss (-l "$dir/$_")} @files;` [download]	[reply] [d/l] [select]
Re: Re: How to map a directory tree to a perl hash tree by Rudif (Hermit) on Mar 12, 2001 at 01:44 UTC
Is this what you want or am I missing something? Yes, it does exactly what I want. Now we have 3 solutions to compare: Yours is perhaps the most natural - a recursive solution for a recursive problem. japhy's solution neatly replaces recursion by iteration and enables the use of File::Find::find. Mine , deprecated because based on string eval, uses the information extracted with File::Find::find directly. All have similar complexity. Performance will probably be dominated by the disk access operations, not by the algorithm used. Rudif	[reply]
Re: How to map a directory tree to a perl hash tree by brother ab (Scribe) on Mar 11, 2001 at 13:13 UTC
May be it would be suitable to use tied hash (see e.g. Tie::Dir)? Do you really need to load all the directory tree into the memory before scanning? -- brother ab	[reply]
Re: Re: How to map a directory tree to a perl hash tree by Rudif (Hermit) on Mar 12, 2001 at 02:15 UTC
May be it would be suitable to use tied hash(see e.g. Tie::Dir)? Tie::Dir handles files in one directory. It should be posible to write a Tie::DirTree along similar lines. At times I used Tie modules (Win32::OLE, Tie::Registry), but at this stage I don't have a feel for the advantages of a Tie module relative to a plain module. I suppose a Tie module adds syntactic convenience. Can you or any other Monk elaborate on this? Do you really need to load all the directory tree into the memory before scanning? It depends on what I want to do. A simple job (e.g. count the number of .html files and compute their total size) can be done simply in the find's wanted subroutine. For more elaborate jobs it might be better to first extract the relevant directory and file information into a Perl structure and then work on that structure, for example: generate a graphical representation of the directory tree for a browser (javascript, html, chm) compare the perl/html tree with perl/lib and per/site/lib trees to find .pm and .pod files that have no corresponding .html file, and make that file Rudif	[reply]

Back to Seekers of Perl Wisdom