PerlMonks  

Comparing Values PER Sub-folder

by omegaweaponZ (Beadle)
on Sep 04, 2012 at 20:34 UTC (#991686=perlquestion)
omegaweaponZ has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I need some thoughts on this. I am looking to compare the number of lines in the files of each sub-folder, for every sub-directory under a given base directory. The comparison itself is easily handled with Tie::File, but how can I loop through the files so that each one is compared only against the other files in its own sub-directory set? For example:

A) Master Directory
AA) Subfolder 1
1) Sub Folder A
2) Sub Folder B
3) Sub Folder C
BB) Subfolder 2
1) Sub Folder A
2) Sub Folder B
3) Sub Folder C
etc...

I want to compare the line counts of all files that end in a given extension (like .txt) in sub folders A-C, for each AA or BB subfolder ONLY. So if every file in each sub-folder of AA Subfolder 1 has 4 lines of text, that comparison is kept separate from BB's subfolder comparison.

Hope that isn't confusing. Again, I'm not concerned with getting the line counts; that's easy enough through a count of an array. It's breaking up each sub-folder, perhaps with a while loop, to separate it out that's difficult.

Re: Comparing Values PER Sub-folder
by Kenosis (Priest) on Sep 04, 2012 at 23:26 UTC

    If I correctly understand your goal, perhaps File::Find and File::Slurp will be helpful:

    use Modern::Perl;
    use File::Find;
    use File::Slurp qw/read_file/;

    my $startDir = '.';

    find( { wanted => \&countLines, }, $startDir );

    sub countLines {
        /\.txt$/ or return;

        my $completePath = $File::Find::name;
        my $curDir       = $File::Find::dir;
        my $curFile      = $_;

        my @fileLines = read_file $curFile;
        my $numLines  = @fileLines;

        say "Cur dir: $curDir; Cur file: $curFile; Num Lines: $numLines";
    }

    Partial output:

    Cur dir: ./test/test bbb; Cur file: B.txt; Num Lines: 6

    The script above will start at $startDir and descend into directories, processing only *.txt files. Consider using a hash (key = $curDir; val = totLines) to store totalFileLines per directory.
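The hash suggestion could be sketched roughly as follows. This is only an illustration, not the code from the post: the names count_lines and lines_per_dir are made up for the example, and it counts lines with a plain filehandle rather than File::Slurp so that it needs core modules only.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: count the lines in one file without slurping it.
sub count_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $count = 0;
    # A lexical loop variable avoids clobbering File::Find's $_.
    while ( my $line = <$fh> ) {
        $count++;
    }
    close $fh;
    return $count;
}

# Hypothetical helper: walk $start_dir and return a hash ref mapping
# each directory to the total line count of the *.txt files it holds.
sub lines_per_dir {
    my ($start_dir) = @_;
    my %total;
    find(
        sub {
            return unless /\.txt$/ && -f $_;
            # File::Find has chdir'ed into the directory, so $_ opens as-is.
            $total{$File::Find::dir} += count_lines($_);
        },
        $start_dir,
    );
    return \%total;
}

# Start directory is an assumption: first argument, else the current dir.
my $start  = @ARGV ? $ARGV[0] : '.';
my $totals = lines_per_dir($start);
printf "%s: %d\n", $_, $totals->{$_} for sort keys %$totals;
```

Because the keys are full directory paths, entries under one top-level subfolder share a common prefix, so they can be grouped and compared separately from those under another.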

    Hope this helps!

      I think I see your logic. I can probably also just do an if/else to check whether curdir matches, in order to compare and contrast. I slightly modified the code to do this, but I'm getting no output. Where am I going wrong? $dir is the current working directory.

      find(\&countLines, $dir);

      sub countLines {
          /\.txt$/ or return;

          my $completePath = $File::Find::name;
          my $curDir       = $File::Find::dir;
          my $curFile      = $_;

          tie my @filelines, 'Tie::File', $curFile or die;
          my $numLines = @filelines;

          print "Cur dir: $curDir; Cur file: $curFile; Num Lines: $numLines\n";
      }

        Let me first address another issue... Tie::File can be rather slow--especially when used on large files. This is why I used File::Slurp for the line count. Also, remember that you should untie the formerly tied array when done with it.

        I ran your subroutine, adding untie @filelines; before the end of the code block, and it executed just fine (it ran fine w/o that addition, too, but it's best to follow a tie with an untie). I'm unsure why you're not getting any output from the routine...
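For reference, a minimal sketch of pairing tie with untie when counting lines. The sub name tied_line_count is hypothetical, and opening the file read-only via mode => O_RDONLY is an assumption added here so that counting never creates or modifies a file; neither comes from the code discussed above.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use Tie::File;

# Hypothetical helper: return the number of lines in $path via Tie::File,
# releasing the tie before returning.
sub tied_line_count {
    my ($path) = @_;

    # Read-only mode is an assumption of this sketch.
    tie my @lines, 'Tie::File', $path, mode => O_RDONLY
        or die "Cannot tie $path: $!";

    my $count = @lines;    # array in scalar context gives the record count
    untie @lines;          # follow every tie with an untie

    return $count;
}
```

Including $! in the die message should make a failed tie easier to diagnose than a bare die.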

Node Type: perlquestion [id://991686]
Approved by Corion