"I am thinking of populating data directly into xml_data from each DB1/DB2/DIFF loop."
I suspect that's probably the best option. You'll get some duplicate assignments but I don't imagine that will be a problem. Getting around that would likely involve setting and testing flags that indicate whether a particular assignment has occurred. Benchmark if you feel it's important.
I see you've added some realistic data. Below, you'll see I've added some more to simulate these new scenarios you've described ("... only in DB1 or DB2 ..."). Where you have data like "A B C", I've plugged the spaces with underscores (i.e. "A_B_C"); this was just so I didn't have to rework how I was handling the data with Inline::Files and split: I'm assuming you're already getting your resultsets as array data. One thing I noticed was that I'm getting <KEY>A_B_C</KEY> while your output is showing <KEY>'A B C'</KEY>. The start and end tags act as delimiters so additional quotes aren't generally necessary in XML; they could potentially cause issues like the data being converted to 'A B C': my instinct would be to remove the quotes before populating the XML — you may have a valid reason for leaving them in.
Here's the updated code (includes handling the <null> field):
#!/usr/bin/env perl
use strict;
use warnings;
use Inline::Files;
use XML::Simple qw{:strict};
my %xml_hash = (Data => {});
my $xml_data = $xml_hash{Data};
my (%db1, %db2);
while (<DB1>) {
my ($root, $table, $key) = split;
push @{$db1{$root}{$table}}, $key;
$xml_data->{$root}{$table}{NEW1}{KEY} = [$key];
}
while (<DB2>) {
my ($root, $table, $key) = split;
push @{$db2{$root}{$table}}, $key;
$xml_data->{$root}{$table}{NEW2}{KEY} = [$key];
}
while (<DIFF>) {
my ($root, $table, $key, $col, $old, $new)
= map { $_ eq '<null>' ? '' : $_ } split;
$xml_data->{$root}{$table}{NEW1}{KEY} = $db1{$root}{$table}
if exists $db1{$root}{$table};
$xml_data->{$root}{$table}{NEW2}{KEY} = $db2{$root}{$table}
if exists $db2{$root}{$table};
$xml_data->{$root}{$table}{MODIFIED}{KEY}{$key}{$col}{oldvalue} =
+[$old];
$xml_data->{$root}{$table}{MODIFIED}{KEY}{$key}{$col}{newvalue} =
+[$new];
}
print XMLout(\%xml_hash, KeepRoot => 1, KeyAttr => {KEY => 'name'});
__DIFF__
EMPLOYEE EMPLOYEE XYZ TITLE <null> Mr
EMPLOYEE EMPDETAILS DEF CITY California New York
CUSTOMER CUSTOMER ABC CAPTION Regular Premium
__DB1__
EMPLOYEE EMPLOYEE NEW_EMPLOYEE_1
EMPLOYEE EMPLOYEE NEW_EMPLOYEE_9
EMPLOYEE EMPDETAILS NEW_EMPLOYEE1-DETAILS
EMPLOYEE EMPDETAILS NEW_EMPLOYEE9-DETAILS
EMPLOYEE EMPDETAILS NEW_EMPLOYEE16-DETAILS
IN_DB1_ONLY IN_DB1_ONLY IN_DB1_ONLY
IN_DB1+DB2 IN_DB1+DB2 IN_DB1+DB2
__DB2__
EMPLOYEE EMPLOYEE NEW_EMPLOYEE_6
EMPLOYEE EMPDETAILS NEW_EMPLOYEE6-DETAILS
CUSTOMER CUSTOMER NEW_CUSTOMER
IN_DB2_ONLY IN_DB2_ONLY IN_DB2_ONLY
IN_DB1+DB2 IN_DB1+DB2 IN_DB1+DB2
Output:
$ pm_xml_db_diff2.pl
<Data>
<CUSTOMER>
<CUSTOMER>
<MODIFIED>
<KEY name="ABC">
<CAPTION>
<newvalue>Premium</newvalue>
<oldvalue>Regular</oldvalue>
</CAPTION>
</KEY>
</MODIFIED>
<NEW2>
<KEY>NEW_CUSTOMER</KEY>
</NEW2>
</CUSTOMER>
</CUSTOMER>
<EMPLOYEE>
<EMPDETAILS>
<MODIFIED>
<KEY name="DEF">
<CITY>
<newvalue>New</newvalue>
<oldvalue>California</oldvalue>
</CITY>
</KEY>
</MODIFIED>
<NEW1>
<KEY>NEW_EMPLOYEE1-DETAILS</KEY>
<KEY>NEW_EMPLOYEE9-DETAILS</KEY>
<KEY>NEW_EMPLOYEE16-DETAILS</KEY>
</NEW1>
<NEW2>
<KEY>NEW_EMPLOYEE6-DETAILS</KEY>
</NEW2>
</EMPDETAILS>
<EMPLOYEE>
<MODIFIED>
<KEY name="XYZ">
<TITLE>
<newvalue>Mr</newvalue>
<oldvalue></oldvalue>
</TITLE>
</KEY>
</MODIFIED>
<NEW1>
<KEY>NEW_EMPLOYEE_1</KEY>
<KEY>NEW_EMPLOYEE_9</KEY>
</NEW1>
<NEW2>
<KEY>NEW_EMPLOYEE_6</KEY>
</NEW2>
</EMPLOYEE>
</EMPLOYEE>
<IN_DB1+DB2>
<IN_DB1+DB2>
<NEW1>
<KEY>IN_DB1+DB2</KEY>
</NEW1>
<NEW2>
<KEY>IN_DB1+DB2</KEY>
</NEW2>
</IN_DB1+DB2>
</IN_DB1+DB2>
<IN_DB1_ONLY>
<IN_DB1_ONLY>
<NEW1>
<KEY>IN_DB1_ONLY</KEY>
</NEW1>
</IN_DB1_ONLY>
</IN_DB1_ONLY>
<IN_DB2_ONLY>
<IN_DB2_ONLY>
<NEW2>
<KEY>IN_DB2_ONLY</KEY>
</NEW2>
</IN_DB2_ONLY>
</IN_DB2_ONLY>
</Data>
|