2011-04-28 13:25:47 INFO [main:114] <Message><Tag attribute="value">Answer</Tag></Message>
2011-04-28 13:45:12 DEBUG [Populate::List:31] <Message><Tag attribute="value">Answer</Tag></Message>
In other words, a standard Log4J log where each log entry contains an XML document. I am parsing the log with code similar to this:
while (<$fh>) {
    chomp;
    my ($date, $time, $log_lvl, $trace, $xml) = split ' ', $_, 5;
}
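As a sanity check on this approach, the limit of 5 on `split` is what keeps the whitespace inside the XML payload from being split into further fields; the fifth field soaks up the rest of the line. A quick illustration (the sample line here is hypothetical, matching the format above):

```perl
use strict;
use warnings;

# Hypothetical sample line in the log format shown above.
my $line = '2011-04-28 13:25:47 INFO [main:114] '
         . '<Message><Tag attribute="value">Answer</Tag></Message>';

# With a limit of 5, split stops after four fields and leaves the
# remainder of the line (including its internal spaces) in $xml.
my ($date, $time, $log_lvl, $trace, $xml) = split ' ', $line, 5;
print "$date\n";    # 2011-04-28
print "$xml\n";     # <Message><Tag attribute="value">Answer</Tag></Message>
```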
For each XML document, I need to convert it to a Perl data structure and do something with it. That would look something like:
my $twig = XML::Twig->new();
while (<$fh>) {
    chomp;
    my ($date, $time, $log_lvl, $trace, $xml) = split ' ', $_, 5;
    my %data_structure;
    $twig->parse($xml);
    # Build up %data_structure using $twig
}
I could easily change this code to be "elegant" as such:
while (<$fh>) {
    chomp;
    my ($date, $time, $log_lvl, $trace, $xml) = split ' ', $_, 5;
    my $data_structure = extract_data($xml);
}
sub extract_data {
    my ($xml) = @_;
    my $data = {};
    my $twig = XML::Twig->new(
        twig_handlers => {
            Message => sub { handle_message(@_, $data) },
        },
    );
    $twig->parse($xml);
    return $data;
}

sub handle_message {
    # ...
}
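For concreteness, a handler body might look something like the sketch below. This is purely an assumption on my part: the `messages` key, the per-`Tag` hash layout, and the `purge` call are all hypothetical, since the real body is elided above. XML::Twig passes the twig and the current element to every handler, and the closure in `extract_data` appends `$data` as a third argument:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical body for handle_message: copy each <Tag> child's
# attribute and text into the shared hash that extract_data passes in.
sub handle_message {
    my ($twig, $elt, $data) = @_;
    for my $tag ($elt->children('Tag')) {
        push @{ $data->{messages} }, {
            attribute => $tag->att('attribute'),
            text      => $tag->text,
        };
    }
    $twig->purge;    # release already-handled elements; helps on large inputs
}
```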
There is absolutely nothing wrong with this, and I haven't profiled it to confirm it isn't fast enough, but that is my concern: I would like to inline as much as possible. Now that I have laid it all out, I realize that if someone else were asking this question, I would tell them to stop being falsely lazy, write it in a clear, maintainable way, profile it, and only worry about performance if it proves unacceptable.