The three-argument form of open is recommended, and we should also check for failure:
open my $firstfile, '<', 'DFLOG1.txt'
    or die "Can't open DFLOG1.txt: $!";
Nice work on the code beyond my nit :)
Thank you for your reply and code modification. As I said, I am new to Perl and am not familiar with some of the code you provided.
I am including a sample interval of the files I am evaluating.
First file:
00:00:01
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/hd4 262144 102228 159916 39% /
/dev/hd2 2359296 2243384 115912 96% /usr
/dev/hd9var 1048576 199528 849048 20% /var
/dev/hd3 1048576 8240 1040336 1% /tmp
/dev/hd1 2097152 41140 2056012 2% /home
/proc - - - - /proc
/dev/hd10opt 524288 206640 317648 40% /opt
/dev/ts1000 262144 716 261428 1% /usr/local
/dev/ts1001 6291456 5967256 324200 95% /banktools
/dev/ts1002 786432 448 785984 1% /stage
/dev/ap1001 20971520 11231912 9739608 54% /oracle
/dev/ap1002 36700160 11310284 25389876 31% /ora01/oradata
/dev/ap1003 36700160 13372456 23327704 37% /ora02/oradata
/dev/ap1004 31457280 5995712 25461568 20% /ora03/oradata
/dev/ap1005 20971520 11067000 9904520 53% /ora04/oradata
/dev/ap1006 26214400 23716476 2497924 91% /ora05/oradata
/dev/ap1007 26214400 15031656 11182744 58% /ora06/oradata
/dev/ap1008 20971520 17307236 3664284 83% /ora01/orabkup
/dev/ap1009 209715200 35552472 174162728 17% /ora01/oraflash
Second file:
23:00:00
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/hd4 262144 102484 159660 40% /
/dev/hd2 2359296 2243384 115912 96% /usr
/dev/hd9var 1048576 197148 851428 19% /var
/dev/hd3 1048576 8928 1039648 1% /tmp
/dev/hd1 2097152 40956 2056196 2% /home
/proc - - - - /proc
/dev/hd10opt 524288 206820 317468 40% /opt
/dev/ts1000 262144 716 261428 1% /usr/local
/dev/ts1001 6291456 6093220 198236 97% /banktools
/dev/ts1002 786432 448 785984 1% /stage
/dev/ap1001 20971520 11312864 9658656 54% /oracle
/dev/ap1002 36700160 11310284 25389876 31% /ora01/oradata
/dev/ap1003 36700160 13372456 23327704 37% /ora02/oradata
/dev/ap1004 31457280 5995712 25461568 20% /ora03/oradata
/dev/ap1005 20971520 11067000 9904520 53% /ora04/oradata
/dev/ap1006 26214400 23716476 2497924 91% /ora05/oradata
/dev/ap1007 26214400 15031656 11182744 58% /ora06/oradata
/dev/ap1008 20971520 17307236 3664284 83% /ora01/orabkup
/dev/ap1009 209715200 35552472 174162728 17% /ora01/oraflash
You guessed correctly: I want to read each line of both files, compare the matching mount points, and write one line of output per mount point to a .csv file, including the total, used, and free disk space from both the beginning file and the end file.
I have a question regarding the subroutine consume that you created. Does it read in a single line of the first file and perform the formatting before reading in a single line of the second file, or are all lines of the first file read and formatted before the second file is called?
I ask because my desired output is a single line of data for each mount point listed, including the total, used, and free space from each file.
Sorry if my question is confusing. I have only been using Perl for two weeks and have been self taught.
Thanks for your cooperation.
consume() slurps up the whole file and builds a hash out of it.
Each entry in the hash is an independent, nameless array that contains the appropriate data for each line.
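To make the read order concrete, here is a minimal sketch. It assumes a simplified stand-in for consume() (named consume_sketch here, which is not the original routine) and uses in-memory filehandles in place of the real log files, with one sample row from each posted file. Each call reads its handle to end-of-file before returning, so the first file is fully parsed before the second is read at all.

```perl
use strict;
use warnings;

# Hypothetical stand-in for consume(): reads the whole handle, then returns.
sub consume_sketch {
    my ($fh) = @_;
    my %result;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless $line =~ m{^/dev/};    # skip timestamp and header lines
        my ($name, $total, $used, $free) = split ' ', $line;
        $result{$name} = [ $total, $used, $free ];    # one anonymous array per line
    }
    return %result;
}

# In-memory filehandles stand in for the two log files (assumption for the demo).
my $first_text  = "00:00:01\n/dev/hd4 262144 102228 159916 39% /\n";
my $second_text = "23:00:00\n/dev/hd4 262144 102484 159660 40% /\n";

open my $first_fh,  '<', \$first_text  or die "Can't open in-memory file: $!";
open my $second_fh, '<', \$second_text or die "Can't open in-memory file: $!";

my %first  = consume_sketch($first_fh);     # first file is read to EOF here
my %second = consume_sketch($second_fh);    # only then is the second file read

print "used at start: $first{'/dev/hd4'}[1]\n";
print "used at end:   $second{'/dev/hd4'}[1]\n";
```

Nothing is paired up while reading; the matching happens afterwards, by looking up the same key in both hashes.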
Now you have two "phone books" of names of filesystems, each of which uses the same names (such as '/dev/hd4', etc.). So that means you can use that name to pull out the relevant statistics for each of the two files. Let me see if I can make this simpler: instead of using the anonymous arrays, let's use a nested hash. If you were writing all this down on paper, you might make a table for each machine that had the filesystem names as rows, and the fields (total, used, and free) as the columns.
To do something like that in Perl, we'd rewrite consume() to do this instead:
sub consume {
    my ($filehandle) = @_;
    my %result;
    while ( defined( my $line = <$filehandle> ) ) {
        chomp $line;
        # Skip the timestamp, header, and /proc lines.
        next unless $line =~ m{^/dev/};
        # First four columns: name (e.g. /dev/hd4), total, used, free.
        my ($mount_point, $total_space, $used_space, $free_space) = split ' ', $line;
        $result{$mount_point}{total} = $total_space;
        $result{$mount_point}{used}  = $used_space;
        $result{$mount_point}{free}  = $free_space;
    }
    return %result;
}
See how we set that up? The mount point looks up a place in the hash that contains another hash nested inside it, and we use the words 'total', 'used', and 'free' to store the relevant numbers in that nested hash. So now your calculations if (say) you wanted to list the differences would look like this:
# Assume the first machine is the more important one and we want to be sure
# we check all its filesystems. (We can't guarantee we looked at all of the
# second machine's filesystems, because this just uses the same keys as
# machine 1 to look at machine 2. There might be more filesystems with
# different names.)
my @unmatched;
# Section 1: matched on both.
foreach my $filesystem_name (sort keys %first_machine) {
    if (exists $second_machine{$filesystem_name}) {
        # Filesystem mounted on both machines
        print "$filesystem_name: ";
        foreach my $type (qw(total free used)) {
            my $difference = $first_machine{$filesystem_name}{$type}
                           - $second_machine{$filesystem_name}{$type};
            print "$type: $difference " if $difference;
        }
        print "\n"; # finish the line and output it
        delete $second_machine{$filesystem_name};
    }
    else {
        # Only on machine 1: rebuild the record and save it for later.
        push @unmatched, "$filesystem_name: ";
        foreach my $kind (qw(total free used)) {
            $unmatched[-1] .= $first_machine{$filesystem_name}{$kind} . " ";
        }
        $unmatched[-1] .= "\n";
    }
}
# Second section: unmatched on first machine.
if (@unmatched) {
    print @unmatched;
}
# Third section: unmatched on second machine.
if (keys %second_machine) {
    # unprocessed filesystems on 2 not on 1.
    print "Unmatched filesystems on machine 2:\n";
    foreach my $unmatched (sort keys %second_machine) {
        print "$unmatched ";
        foreach my $kind (qw(total free used)) {
            print $second_machine{$unmatched}{$kind}, " ";
        }
        print "\n";
    }
}
The first section looks for items in the second table that match the ones in the first, and prints the comparison between the two. Note that delete() in there: that throws away items in the second hash that we've already processed (we could add a 'matched' field to the hash if it was particularly expensive to re-create the items, but that's not the case here). If we don't find a match, we concatenate the record back together and add it to the @unmatched array, all ready to print.
We check that array after we finish the pass over the first machine's filesystem to see if we had any unmatched machine 1 filesystems, and just print them all if there are any.
By the time we reach the third section, every entry in the second system's table that matched one in the first has been deleted, so anything left is a filesystem on the second machine with no match on the first. We format and print those as well.
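If what you want is the CSV described earlier in the thread rather than a list of differences, the same two nested hashes can be flattened into one line per filesystem. This is only a sketch: the sample values are copied from the posted data, while the column names, the @fields order, and the @csv_lines array are assumptions made for illustration. It also assumes the filesystem names contain no commas, so no CSV quoting is done.

```perl
use strict;
use warnings;

# Sample data in the shape the nested-hash consume() returns.
my %first_machine = (
    '/dev/hd4'    => { total => 262144,  used => 102228,  free => 159916 },
    '/dev/ts1001' => { total => 6291456, used => 5967256, free => 324200 },
);
my %second_machine = (
    '/dev/hd4'    => { total => 262144,  used => 102484,  free => 159660 },
    '/dev/ts1001' => { total => 6291456, used => 6093220, free => 198236 },
);

my @fields = qw(total used free);

# Header row: hypothetical column names for the start and end snapshots.
my @csv_lines = ( join ',', 'filesystem',
                  ( map { "start_$_" } @fields ),
                  ( map { "end_$_" }   @fields ) );

foreach my $name (sort keys %first_machine) {
    next unless exists $second_machine{$name};    # only names present in both
    # Hash slices pull total, used, free out of each nested hash in order.
    push @csv_lines, join ',', $name,
        @{ $first_machine{$name} }{@fields},
        @{ $second_machine{$name} }{@fields};
}

print "$_\n" for @csv_lines;
```

To write a file instead of printing to the screen, open an output handle with the same three-argument open and failure check shown at the top of the thread, and print to that handle.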
Any other kinds of analysis fall into your bailiwick rather than mine, but that should provide you with a starting point. I switched the implementation because the anonymous arrays are a little harder to understand if you're just getting started.