Although PerlMonks is not a code-writing service, sometimes it is just easier to explain what you have to do by simply writing the code, especially when your requirements are not entirely clear.

As I understand it, you have one large file with data in a certain format, which you have to parse in order to get some kind of summarized results, i.e. the number of passed and failed "Case-URL" and "Req-URL" items.

Every time you hear "large file", you should think of reading, parsing, and handling the file on a record-by-record basis. That will minimize your memory requirements. It also means that you have to determine the record format and, more specifically, the record delimiter. Sometimes the record delimiter can be as simple as a CR/LF; sometimes it is longer. In this case it is "__________________________________________________________\n".

What you have to do is read the file record by record. Fortunately Perl can do that easily: all you have to do is assign the record delimiter to the $/ variable. Then you can read the file a record at a time and, through the magic of regular expressions, extract the data you need and update the variables with the count of the data found.

The following is one of the ways to do this:
use Modern::Perl;
use Data::Dump qw/dump/;
local $/ = "__________________________________________________________\n";
my %results;
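while ( my $record = <DATA> ) {
    # A record counts as passed only if it contains the literal ***Passed*** marker.
    my $pass = $record =~ m/\Q***Passed***\E/ ? 'Passed' : 'Failed';
    for my $line ( split /\n/, $record ) {
        # Only lines starting with "[" carry a label and a URL.
        next unless $line =~ /^\[/;
        my ( $case_req, $url ) = split /\s+-\s+/, $line;
        $results{$pass}{$case_req}{$url}++;
    }
}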
say dump(%results);
__DATA__
Execution start time 09/13/2013 02:43:55 pm
[Case-Url] - www.google.com
[Req-URL ] - www.qtp.com
***Passed***
__________________________________________________________
[Case-Url] - www.yahoo.com
[Req-URL ] - www.msn.com
***Passed***
__________________________________________________________
[Case-Url] - www.google.com
[Req-URL ] - www.qtp.com
***Failed***
Output:
(
"Passed",
{
"[Case-Url]" => { "www.google.com" => 1, "www.yahoo.com" => 1 },
"[Req-URL ]" => { "www.msn.com" => 1, "www.qtp.com" => 1 },
},
"Failed",
{
"[Case-Url]" => { "www.google.com" => 1 },
"[Req-URL ]" => { "www.qtp.com" => 1 },
},
)
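
If what you are after is the plain number of passed and failed items per label rather than the per-URL breakdown, the nested hash can be collapsed in a second pass. The following is only a sketch: the summarize() helper and the %summary hash are names I made up for illustration, and the sample data is copied by hand from the output above instead of being parsed from the file.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): collapse the nested
# results hash into one total per status and per label.
sub summarize {
    my ($results) = @_;
    my %summary;
    for my $status ( keys %$results ) {
        for my $label ( keys %{ $results->{$status} } ) {
            # Add up the per-URL counts stored under this label.
            my $total = 0;
            $total += $_ for values %{ $results->{$status}{$label} };
            $summary{$status}{$label} = $total;
        }
    }
    return \%summary;
}

# Sample data copied by hand from the output shown above.
my %results = (
    Passed => {
        '[Case-Url]' => { 'www.google.com' => 1, 'www.yahoo.com' => 1 },
        '[Req-URL ]' => { 'www.msn.com'    => 1, 'www.qtp.com'   => 1 },
    },
    Failed => {
        '[Case-Url]' => { 'www.google.com' => 1 },
        '[Req-URL ]' => { 'www.qtp.com'    => 1 },
    },
);

my $summary = summarize( \%results );
for my $status ( sort keys %$summary ) {
    for my $label ( sort keys %{ $summary->{$status} } ) {
        print "$status $label: $summary->{$status}{$label}\n";
    }
}
```

With the sample data this should report one failed and two passed items for each of the two labels. Summing the values rather than counting the keys means a URL that shows up in several records is counted once per record, which is probably what you want for a test report.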
CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics