If I wanted to produce the following result:
Header stuff
123456|987|12
Apples|9
Oranges|19
Bananas|4
Footer junk
Header stuff
123456|987|34
Apples|7
Oranges|15
Bananas|11
Footer junk
Header stuff
123456|987|56
Apples|3
Oranges|9
Bananas|8
Footer junk
from the two input files, fake1.dat:
Header stuff
123456|987|12
Apples|4
Oranges|12
Bananas|3
Footer junk
Header stuff
123456|987|34
Apples|5
Oranges|7
Bananas|8
Footer junk
Header stuff
123456|987|56
Apples|2
Oranges|1
Bananas|3
Footer junk
and fake2.dat:
Header stuff
123456|987|12
Apples|5
Oranges|7
Bananas|1
Footer junk
Header stuff
123456|987|34
Apples|2
Oranges|8
Bananas|3
Footer junk
Header stuff
123456|987|56
Apples|1
Oranges|8
Bananas|5
Footer junk
I would probably write a script like this to do it:
#!/usr/bin/perl
use strict;
use warnings;

my %data;

# Go looking for files that match this pattern.
foreach my $thisFile ( glob("fake?.dat") ) {
    # Open the file for reading, and die if that doesn't work.
    open( my $input, '<', $thisFile )
        or die "Unable to open $thisFile: $!";
    my ( $header, $id, @data, $footer );
    while (<$input>) {
        # Read in a line from the file. We're expecting
        # a header, an ID line, followed by a bunch of
        # lines of data, terminated by a footer. There
        # can be several of these records in a file. For
        # the sake of simplicity, we assume that the
        # lines of data are always present and always in
        # the same order.
        chomp;
        if ( defined($header) ) {
            if ( defined($id) ) {
                if (/Footer/) {
                    # If we just saw a footer, that's the end
                    # of a record and we can process what we
                    # have now.
                    $footer = $_;
                    # The unique ID number is the last number
                    # on the ID line.
                    my ($id3) = $id =~ m/\|(\d+)$/;
                    # Store this record's information into a
                    # hash, either re-using the existing hash
                    # element, or creating a new one.
                    if ( exists( $data{$id3} ) ) {
                        # Seen this ID before: add each new count
                        # to the matching line stored so far.
                        my @updatedData;
                        foreach ( @{ $data{$id3}->{data} } ) {
                            my @dataSoFar = split( /\|/, $_ );
                            my @thisData  = split( /\|/, shift @data );
                            $dataSoFar[1] += $thisData[1];
                            push( @updatedData, join( '|', @dataSoFar ) );
                        }
                        $data{$id3}->{data} = \@updatedData;
                    }
                    else {
                        $data{$id3}->{header} = $header;
                        $data{$id3}->{id}     = $id;
                        push( @{ $data{$id3}->{data} }, @data );
                        $data{$id3}->{footer} = $footer;
                    }
                    # Clear variables for next loop around the
                    # input file.
                    undef $header;
                    undef $id;
                    @data = ();
                    undef $footer;
                }
                else {
                    push( @data, $_ );
                }
            }
            else {
                $id = $_;
            }
        }
        else {
            $header = $_;
        }
    }
    close($input);
}

# Having added up the various lines of data, we now
# dump out a summary, sorted numerically by ID.
foreach my $thisKey ( sort { $a <=> $b } keys %data ) {
    print "$data{$thisKey}->{header}\n";
    print "$data{$thisKey}->{id}\n";
    foreach ( @{ $data{$thisKey}->{data} } ) {
        print "$_\n";
    }
    print "$data{$thisKey}->{footer}\n";
}
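The heart of the script is the step that adds one record's counts onto the counts already stored for the same ID. Here is a minimal sketch of just that step in isolation. The helper name add_counts is hypothetical (it is not in the script above), and it assumes both lists have the same items in the same order, just as the script does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: add the counts in @$new
# onto the matching lines of @$sofar. Both arguments are array refs of
# "Name|count" lines, assumed to be the same length and in the same order.
sub add_counts {
    my ( $sofar, $new ) = @_;
    my @sum;
    for my $i ( 0 .. $#{$sofar} ) {
        my ( $name, $count ) = split /\|/, $sofar->[$i];
        my ( undef, $extra ) = split /\|/, $new->[$i];
        push @sum, join( '|', $name, $count + $extra );
    }
    return \@sum;
}

my $total = add_counts(
    [ 'Apples|4', 'Oranges|12', 'Bananas|3' ],    # record 12 in fake1.dat
    [ 'Apples|5', 'Oranges|7',  'Bananas|1' ],    # record 12 in fake2.dat
);
print "$_\n" for @$total;
```

Run on the two records with ID 12, this should print the Apples|9, Oranges|19 and Bananas|4 lines from the desired result. If the data lines could show up in a different order, matching them by name (say, with a hash keyed on the item name) would be safer than matching by position.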
See if that helps you.
Alex / talexb / Toronto
"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds