PerlMonks  

Using File::Find to create a hash of directory structure.

by AnaximanderThales (Novice)
on Dec 19, 2015 at 22:00 UTC (#1150770=perlquestion)

AnaximanderThales has asked for the wisdom of the Perl Monks concerning the following question:

Oh Holy Monks, I seek thy wisdom.

Below is my code and output. My attempt is to create a hash of my directory listing. The parse_tree sub does exactly that, but I'm not happy with the results. I found this function from -- er, I believe here -- though it could have been SO.

Regardless, my issue is with the way it's handling the root path. I do see a difference when I exclude the trailing '/' from my path; however, an issue still remains.

My issue is with the '' => 'home' => 'user' => 'bin' structure. If I exclude the trailing '/', the files under there are in the root of the hash, but then all folders fall under '' => 'TEST', etc.

My additional issue is that I have not been able to figure out what the parse_tree function is actually doing. I'm just not skilled enough to understand these lines:

$r = $r->{$_} ||= {} for split m|/|, $tmp; #/
$dl{$name} ||= $r;
nor how the hash %root is being populated.

My goal is to have the data appear like this with Dumper:

$VAR1 = {
          'file1' => 'file01',
          'file2' => 'file02',
          'TEST' => {
                      'file1' => 'file01'
                    },
          'test' => {
                      'file3' => 'file03',
                      'subfolder' => {
                                       'file1' => 'file01'
                                     },
                      'file2' => 'file02',
                      'file1' => 'file01'
                    }
        };
I'd like it this way because I won't necessarily know the directory that's being queried. While I could certainly find more code and make it look like that, I'd prefer to understand what this code is doing and get ideas on what I need to change to get the format I'd like to see from Dumper. This will help me learn, and will also let me put comments on the code that I understand when I look at it later on.

Thank you for your time.
#!/usr/bin/perl
#
# Initial perl settings -- Using strict and warning
# to ensure I'm writing proper perl.
use strict;
use warnings;
use diagnostics;
use Data::Dumper;
use File::Find;

my $dirHash = {};    # Hash file of directory
my $testDir = $ENV{"HOME"} . "/bin/";

# Grab a list of files and put into hash
$dirHash = parse_tree("$testDir");

# Dump -- for testing
print "directory hash dump\n" . Dumper $dirHash;

# Functions
sub parse_tree {
    my ($root_path) = @_;
    my %root;
    my %dl;
    my %count;
    my $path_checker = sub {
        my $name = $File::Find::name;
        if (-d $name ) {
            my $r = \%root;
            my $tmp = $name;
            $tmp =~ s/^\Q$root_path\E//;
            $r = $r->{$_} ||= {} for split m|/|, $tmp; #/
            $dl{$name} ||= $r;
        }
        elsif (-f $name) {
            my $dir = $File::Find::dir;
            my $key = "file". ++$count{ $dir };
            $dl{$dir}{$key} = $_;
        }
    };
    find($path_checker, $root_path);
    return \%root;
}
exit 0;
output -- $ENV{"HOME"} . "/bin/";
directory hash dump
$VAR1 = {
          '' => {
                  'home' => {
                              'user' => {
                                              'bin' => {
                                                         'file2' => 'results.txt',
                                                         'file5' => 'rsync_logs',
                                                         'file1' => 'sdu.sh',
                                                         'file3' => 'parse_tree_test.pl',
                                                       }
                                            }
                            }
                },
          'TEST' => {
                      'file1' => 'file05',
                      'file2' => 'file03',
                      'file4' => 'file04',
                      'file3' => 'file02',
                      'file5' => 'file01'
                    },
          'test' => {
                      'file1' => 'file02',
                      'file2' => 'file01',
                      'subtest' => {
                                     'file2' => 'file06',
                                     'file1' => 'file05'
                                   }
                    }
        };
output -- $ENV{"HOME"} . "/bin";
directory hash dump
$VAR1 = {
          '' => {
                  'TEST' => {
                              'file1' => 'file05',
                              'file2' => 'file03',
                              'file4' => 'file04',
                              'file3' => 'file02',
                              'file5' => 'file01'
                            },
                  'test' => {
                              'file1' => 'file02',
                              'file2' => 'file01',
                              'subtest' => {
                                             'file2' => 'file06',
                                             'file1' => 'file05'
                                           }
                            }
                },
          'file2' => 'results.txt',
          'file5' => 'rsync_logs',
          'file1' => 'sdu.sh',
          'file3' => 'parse_tree_test.pl',
        };

Replies are listed 'Best First'.
Re: Using File::Find to create a hash of directory structure.
by Discipulus (Abbot) on Dec 19, 2015 at 22:20 UTC
    welcome AnaximanderThales

    I don't have much time right now, but recursive directory parsing is the oldest question here at PM:
    see "Recursive Directory print" and "Descending a directory tree, returning a list of files" as recent examples. In both I point to tachyon's recursive.. eehm, iterative solution: it is simple and you can extend it at will. tachyon's original post has a plain explanation.
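For readers who have not followed those links, the iterative idea can be sketched roughly as below. This is not tachyon's code, only a minimal illustration of walking a tree with a work list instead of recursion. The sub name list_files and the throwaway directory layout are made up for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical helper: walk a tree iteratively, keeping a work list
# of directories still to visit, and return all plain files found.
sub list_files {
    my ($top) = @_;
    my @todo  = ($top);
    my @files;
    while (defined(my $dir = shift @todo)) {
        opendir(my $dh, $dir) or die "Cannot open $dir: $!";
        for my $entry (sort readdir $dh) {
            next if $entry eq '.' or $entry eq '..';
            my $path = "$dir/$entry";
            if (-d $path && !-l $path) {
                push @todo, $path;      # visit later; skip symlinked directories
            }
            elsif (-f $path) {
                push @files, $path;
            }
        }
        closedir $dh;
    }
    return @files;
}

# Demo on a temporary tree that is removed when the script exits.
my $top = tempdir(CLEANUP => 1);
make_path("$top/test/subtest");
for my $f ("a.txt", "test/b.txt", "test/subtest/c.txt") {
    open(my $fh, '>', "$top/$f") or die "Cannot create $f: $!";
    close $fh;
}
print "$_\n" for list_files($top);
```

Because the work list is a plain array, the same loop could be extended to build a nested hash instead of a flat list, which is closer to what the question asks for.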

    The code you point to is a little obscure, indeed
    my $r = \%root;
    my $tmp = $name;
    $tmp =~ s/^\Q$root_path\E//;
    $r = $r->{$_} ||= {} for split m|/|, $tmp; #/
    $dl{$name} ||= $r;
    It means something like this: $r starts as a reference to the root hash. Then $r becomes the value stored under the current key of the hash it points to, or a new empty hash if nothing is there yet. This is repeated for each part of the path, split at the separator. Finally, if $dl{$name} already holds a true value it is left alone; otherwise it is set to $r. See Or, Or, Equals Zero, $x ||= 0
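Spelled out as an explicit loop, that one-liner might look like the following. It is a sketch using a made-up path string rather than a real File::Find run; %root, $r and $tmp mirror the names in the original code.

```perl
use strict;
use warnings;

my %root;
my $tmp = 'test/subtest';      # example relative path (assumed for the demo)

my $r = \%root;                # start at the top of the tree
for my $part (split m|/|, $tmp) {
    $r->{$part} ||= {};        # create this level if it is not there yet
    $r = $r->{$part};          # then step down into it
}

# $r now refers to the innermost hash, so writing through it
# changes %root as well.
$r->{file1} = 'file05';
print $root{test}{subtest}{file1}, "\n";   # prints "file05"
```

The key point is that $r is only a cursor: reassigning it never copies data, it just moves to a deeper level of the same structure.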

    I cannot see the point of this.. but I'm tired now.

    Best wishes
    L*
    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
        Thank you -- yours and the other reply give me something to think about.
      Thank you -- your reply and the other reply to your post give me something to think about and look at, to see what I can figure out.
Re: Using File::Find to create a hash of directory structure.
by AnaximanderThales (Novice) on Dec 25, 2015 at 23:11 UTC
    Thank you Discipulus and the Anonymous Monk(s).

    I was at least able to comment the code as to what it was doing, and I only needed to make some slight modifications to my code to make it work as I wanted it to.

    I'll just drop in the subroutine and make notes directly on it.

    sub parse_tree {
        # note: make sure trailing slash is removed from passed arg
        my ($root_path) = @_;
        my %root;
        my %dl;
        my %count;

        # This is the wanted for find.
        my $path_checker = sub {
            # grab item name
            my $name = $File::Find::name;

            # If directory --
            if (-d $name ) {
                # Point $r at the root hash
                my $r = \%root;

                # hold the item in temp
                my $tmp = $name;

                # Remove the root path section of the file to get rid of the
                # first part of the path
                $tmp =~ s/^\Q$root_path\E//;

                # If a leading '/' is left, remove that.
                $tmp =~ s/^\///;

                # For each part of the split path: if the entry in the hash
                # is not set, create a new empty hash, then move $r into it.
                $r = $r->{$_} ||= {} for split m|/|, $tmp; #/

                # if $dl{$name} is not set, then assign $r to $dl{$name}.
                $dl{$name} ||= $r;
            }
            elsif (-f $name) {
                # If $name is a file, find the directory path
                my $dir = $File::Find::dir;

                # increment the file count based on that directory.
                my $key = "file". ++$count{ $dir };

                # create a new entry in the hash
                $dl{$dir}{$key} = $_;
            }
        };
        find($path_checker, $root_path);
        return \%root;
    }
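The extra '' key in the earlier output appears to come from how split treats a leading separator: it yields an empty leading field, which then becomes a hash key. A small sketch with made-up paths (not a real File::Find run) shows the difference the added substitution makes:

```perl
use strict;
use warnings;

my $root_path = '/home/user/bin';              # assumed, no trailing slash
my $name      = '/home/user/bin/test/subtest'; # assumed directory name

(my $tmp = $name) =~ s/^\Q$root_path\E//;      # leaves '/test/subtest'
my @before = split m|/|, $tmp;                 # ('', 'test', 'subtest')

$tmp =~ s{^/}{};                               # leaves 'test/subtest'
my @after = split m|/|, $tmp;                  # ('test', 'subtest')

print scalar(@before), " parts before, ", scalar(@after), " after\n";
```

With the trailing slash on the root path, the first output in the question suggests something similar happened one level up: the name reported for the top directory apparently did not match the prefix being stripped, so the whole absolute path was split, leading '' included.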
Re: Using File::Find to create a hash of directory structure.
by Anonymous Monk on Dec 19, 2015 at 22:11 UTC
    and your question is?

      I'm sorry -- I thought my question was obvious.

      In my attempt to figure out how to create the hash of the directory structure, I'm not happy with how the hash is created. As I'm a novice to perl, I was wondering:

    • What do those two lines do?
    • How does the hash %root get populated?
    • What suggestions do you have to achieve my goal?

      For brevity, I'll refer back to the original post for the two lines in question, the code, and the output examples.
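On the second question, a plausible reading of the code is that %dl is a flat index from each directory path to the very same hash reference that sits nested inside %root. Writing a file entry through %dl therefore shows up in %root. A minimal sketch with made-up paths and no File::Find involved:

```perl
use strict;
use warnings;
use Data::Dumper;

my (%root, %dl);

# Pretend the top directory and one subdirectory have been visited.
$dl{'/top'}      = \%root;         # top directory maps to the tree itself
$root{test}      = {};             # nested level for /top/test
$dl{'/top/test'} = $root{test};    # same reference, stored under the full path

# Files are recorded by looking up their directory in %dl ...
$dl{'/top'}{file1}      = 'sdu.sh';
$dl{'/top/test'}{file1} = 'file02';

# ... yet the entries appear in %root, because both names
# point at the same underlying hashes.
$Data::Dumper::Sortkeys = 1;
print Dumper(\%root);
```

This relies on a directory being seen before the files inside it, so that its entry in %dl already exists when the first file arrives.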

Node Type: perlquestion [id://1150770]
Approved by Discipulus