Comparing Values PER Sub-folder

by omegaweaponZ (Beadle)
on Sep 04, 2012 at 20:34 UTC
omegaweaponZ has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I need some thoughts on this. I am looking to compare and contrast the number of file lines per folder in each sub-directory of a given base directory. The comparison itself is easily handled by Tie::File, but how can I loop through the files so that each one is compared only to the other files within a single sub-directory set? For example:

A) Master Directory
    AA) Subfolder 1
        1) Sub Folder A
        2) Sub Folder B
        3) Sub Folder C
    BB) Subfolder 2
        1) Sub Folder A
        2) Sub Folder B
        3) Sub Folder C

I want to compare the lines of all files that end in a given extension (like .txt) in each of sub folders A-C, for each AA or BB subfolder ONLY. So if AA Subfolder 1 has 4 lines of text in each file in each of its sub-folders, that comparison is kept separate from BB's subfolder comparison.

Hope that isn't confusing. Again, I'm not concerned with grabbing the number of lines; that's easy enough through a count of an array. It's breaking up each sub-folder, perhaps with a while loop, to separate it out that's difficult.

Replies are listed 'Best First'.
Re: Comparing Values PER Sub-folder
by Kenosis (Priest) on Sep 04, 2012 at 23:26 UTC

    If I correctly understand your goal, perhaps File::Find and File::Slurp will be helpful:

    use Modern::Perl;
    use File::Find;
    use File::Slurp qw/read_file/;

    my $startDir = '.';

    find( { wanted => \&countLines, }, $startDir );

    sub countLines {
        /\.txt$/ or return;
        my $completePath = $File::Find::name;
        my $curDir       = $File::Find::dir;
        my $curFile      = $_;
        my @fileLines    = read_file $curFile;
        my $numLines     = @fileLines;
        say "Cur dir: $curDir; Cur file: $curFile; Num Lines: $numLines";
    }

    Partial output:

    Cur dir: ./test/test bbb; Cur file: B.txt; Num Lines: 6

    The script above will start at $startDir and descend into directories, processing only *.txt files. Consider using a hash (key = $curDir; val = totLines) to store totalFileLines per directory.
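    The hash idea could be sketched roughly as follows. This is only a sketch under a few assumptions: the helper names lines_per_dir and group_by_top_level are made up for illustration, lines are counted with a plain filehandle rather than File::Slurp, and "sub-directory set" is taken to mean the first path component below the start directory (AA or BB in the example layout).

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;

# Hypothetical helper: walk $start_dir and return a hash reference of
# { directory => { file name => line count } } for *.txt files.
sub lines_per_dir {
    my ($start_dir) = @_;
    my %lines;

    find(
        sub {
            my $name = $_;    # copy, so $_ is left alone for File::Find
            return unless $name =~ /\.txt$/ && -f $name;

            open my $fh, '<', $name
                or die "Cannot open $File::Find::name: $!";
            my $count = 0;
            while ( my $line = <$fh> ) { $count++ }
            close $fh;

            $lines{$File::Find::dir}{$name} = $count;
        },
        $start_dir
    );

    return \%lines;
}

# Hypothetical helper: regroup the per-directory counts by the first path
# component below $start_dir, so each top-level subfolder is kept apart.
sub group_by_top_level {
    my ( $start_dir, $lines ) = @_;
    my %groups;

    for my $dir ( keys %$lines ) {
        my $rel = File::Spec->abs2rel( $dir, $start_dir );
        my ($top) = File::Spec->splitdir($rel);
        for my $file ( keys %{ $lines->{$dir} } ) {
            $groups{$top}{"$dir/$file"} = $lines->{$dir}{$file};
        }
    }

    return \%groups;
}
```

    With something like this, group_by_top_level( $startDir, lines_per_dir($startDir) ) returns one inner hash per top-level subfolder, and the line counts inside each inner hash can be compared without touching the other subfolder's files.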

    Hope this helps!

      I think I see your logic. I can probably also just use an if/else to check whether $curDir matches, and then compare and contrast. I slightly modified the code to do this, but I'm getting no output. Where am I going wrong? $dir is the current working directory.

      find(\&countLines, $dir);

      sub countLines {
          /\.txt$/ or return;
          my $completePath = $File::Find::name;
          my $curDir       = $File::Find::dir;
          my $curFile      = $_;
          tie my @filelines, 'Tie::File', $curFile or die;
          my $numLines = @filelines;
          print "Cur dir: $curDir; Cur file: $curFile; Num Lines: $numLines\n";
      }

        Let me first address another issue... Tie::File can be rather slow--especially when used on large files. This is why I used File::Slurp for the line count. Also, remember that you should untie the formerly tied array when done with it.

        I ran your subroutine, adding untie @filelines; before the end of the code block, and it executed just fine (it ran fine w/o that addition, too, but it's best to follow a tie with an untie). I'm unsure why you're not getting any output from the routine...
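        For reference, a minimal sketch of the tie/untie pairing, next to a plain filehandle count for comparison. The helper names count_lines_tied and count_lines_plain are made up for illustration, and opening the tie with mode => O_RDONLY is my own assumption here (the line count never needs to write to the file).

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use Tie::File;

# Hypothetical helper: count lines through Tie::File, then release the tie.
sub count_lines_tied {
    my ($path) = @_;
    tie my @lines, 'Tie::File', $path, mode => O_RDONLY
        or die "Cannot tie $path: $!";
    my $count = @lines;    # array in scalar context gives the record count
    untie @lines;          # follow every tie with an untie
    return $count;
}

# Hypothetical helper: count lines by reading the filehandle once.
sub count_lines_plain {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $count = 0;
    while ( my $line = <$fh> ) { $count++ }
    close $fh;
    return $count;
}
```

        Either helper could be called from inside the wanted routine with $_ as the path. If only the count is needed, the plain version avoids the tie overhead altogether.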
