vagabonding electron has asked for the wisdom of the Perl Monks concerning the following question:
Dear Monks,
I use XML::Rules to parse a huge XML document. As a Perl amateur I'm generally happy with the module since it does what I mean. Now however I'm stuck with the following problem.
Here is the XML chunk.
<Outpatient_Services>
<Outpatient_Service>
<Outpatient_Clinic>
<AM_Key>AM01</AM_Key>
<Description>Description of the Outpatient_Clinic</Description>
<Explanations>Explanations to the Outpatient_Clinic</Explanations>
</Outpatient_Clinic>
</Outpatient_Service>
<Outpatient_Service>
<Outpatient_Clinic>
<AM_Key>AM01</AM_Key>
<Description>Description of the Outpatient_Clinic</Description>
<Explanations>Explanations to the Outpatient_Clinic</Explanations>
</Outpatient_Clinic>
</Outpatient_Service>
<Outpatient_Service>
<Outpatient_Clinic>
<AM_Key>AM02</AM_Key>
<Description>Description of the Outpatient_Clinic</Description>
<Capacities_Outpatient_Clinic>
<Care_Point>
<VA_VU_Key_Outpatient_Clinic>VA01</VA_VU_Key_Outpatient_Clinic>
</Care_Point>
<Care_Point>
<Other>
<VA_VU_Other_Key_Outpatient_Clinic>VA00</VA_VU_Other_Key_Outpatient_Clinic>
<Description>Other Care Point of the Outpatient Clinic</Description>
</Other>
</Care_Point>
</Capacities_Outpatient_Clinic>
</Outpatient_Clinic>
</Outpatient_Service>
<Outpatient_Service>
<Outpatient_Clinic_Special>
<AM_Special_Key>AM06</AM_Special_Key>
<Description>Description of the Outpatient_Clinic</Description>
<Capacities_Outpatient_Clinic_Special>
<Capacity>
<LK_Key>LK01</LK_Key>
</Capacity>
<Capacity>
<LK_Key>LK02</LK_Key>
</Capacity>
</Capacities_Outpatient_Clinic_Special>
<Explanations>Explanations to the Outpatient_Clinic</Explanations>
</Outpatient_Clinic_Special>
</Outpatient_Service>
<Outpatient_Service>
<Outpatient_Clinic>
<AM_Key>AM04</AM_Key>
</Outpatient_Clinic>
</Outpatient_Service>
<Outpatient_Service>
<Outpatient_Clinic>
<Other>
<AM_Other_Key>AM00</AM_Other_Key>
<Type>Type of the other Outpatient Clinic</Type>
</Other>
<Description>Description of the Outpatient Clinic</Description>
<Capacities_Outpatient_Clinic>
<Care_Point>
<VA_VU_Key_Outpatient_Clinic>VA02</VA_VU_Key_Outpatient_Clinic>
</Care_Point>
<Care_Point>
<Other>
<VA_VU_Other_Key_Outpatient_Clinic>VA00</VA_VU_Other_Key_Outpatient_Clinic>
<Description>Other Care Point of the Outpatient Clinic</Description>
</Other>
</Care_Point>
</Capacities_Outpatient_Clinic>
<Explanations>Explanations to the Outpatient_Clinic</Explanations>
</Outpatient_Clinic>
</Outpatient_Service>
</Outpatient_Services>
Update: The desired output was not presented correctly in the original post; you can see the original in the spoiler below.
The output ought to be the following:
- Outpatient_Services:
AM11: 1
AM07: 1
AM04: 1
However, I need this not only for the AM keys but also for the LK keys, and that is what I cannot achieve, since there are two of them, not one.
The following code chunk is a workaround so that I at least get some information about the "Outpatient_Clinic_Special", but it returns just the description, not the keys.
'Outpatient_Service' => sub {
if (exists $_[1]->{Outpatient_Clinic})
{
if ( exists $_[1]->{Outpatient_Clinic}->{Other} )
{
return $_[1]->{Outpatient_Clinic}->{Description} => 1
}
else
{
return $_[1]->{Outpatient_Clinic}->{AM_Key} => 1
}
}
elsif ( exists $_[1]->{Outpatient_Clinic_Special} )
{
return $_[1]->{Outpatient_Clinic_Special}->{Description} => 1; # as a way out.
}
else
{ }
},
Could you please give me a hint?
Thanks in advance!
VE
Re: How to return two and more values by parsing XML with XML::Rules?
by runrig (Abbot) on Nov 06, 2012 at 23:01 UTC
Sometimes it's simpler to use variables in an outer scope than to go through contortions:
my (@am_keys, @lk_keys);
my @rules = (
AM_Key => sub { push @am_keys, $_[1]{_content}; return },
LK_Key => sub { push @lk_keys, $_[1]{_content}; return },
_default => undef,
);
my $xr = XML::Rules->new( rules => \@rules );
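For anyone who wants to try the outer-scope approach end to end, here is a minimal sketch. It assumes XML::Rules is installed; the inline XML is a cut-down stand-in for the posted document, not the real file, and the variable name $xml is only there for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Rules;

# Cut-down stand-in for the posted document (illustrative only).
my $xml = <<'XML';
<Outpatient_Services>
  <Outpatient_Service>
    <Outpatient_Clinic><AM_Key>AM01</AM_Key></Outpatient_Clinic>
  </Outpatient_Service>
  <Outpatient_Service>
    <Outpatient_Clinic_Special>
      <Capacities_Outpatient_Clinic_Special>
        <Capacity><LK_Key>LK01</LK_Key></Capacity>
        <Capacity><LK_Key>LK02</LK_Key></Capacity>
      </Capacities_Outpatient_Clinic_Special>
    </Outpatient_Clinic_Special>
  </Outpatient_Service>
</Outpatient_Services>
XML

# The rules only collect into the outer-scope arrays and return nothing,
# so the value returned by parse() is not used here.
my (@am_keys, @lk_keys);
my @rules = (
    AM_Key   => sub { push @am_keys, $_[1]{_content}; return },
    LK_Key   => sub { push @lk_keys, $_[1]{_content}; return },
    _default => undef,    # ignore every other tag
);
XML::Rules->new( rules => \@rules )->parse($xml);

print "AM keys: @am_keys\n";
print "LK keys: @lk_keys\n";
```

Because the arrays live outside the parser, it does not matter how many LK_Key elements a single Outpatient_Service contains; each one is simply pushed as it is closed.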
my @rules = (
AM_Key => sub { push @{$_[4]->{pad}{am_keys}}, $_[1]{_content}; return },
LK_Key => sub { push @{$_[4]->{pad}{lk_keys}}, $_[1]{_content}; return },
Outpatient_Services => sub { LK_Keys => $_[4]->{pad}{lk_keys}, AM_Keys => $_[4]->{pad}{am_keys}},
_default => undef,
);
my $xr = XML::Rules->new( rules => \@rules );
With this, $xr->parse() returns a hash of arrays (HoA) containing the array of AM_Keys and the array of LK_Keys.
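To get from such a hash of arrays to the "key: count" listing asked for in the question, the arrays can be flattened and tallied. A minimal sketch in plain Perl follows; $keys is a hand-written stand-in for the return value of parse(), with the AM_Key and LK_Key values copied from the posted sample, so the real structure may differ.

```perl
use strict;
use warnings;

# Stand-in for the hash of arrays described above (assumed shape).
my $keys = {
    AM_Keys => [qw(AM01 AM01 AM02 AM04)],
    LK_Keys => [qw(LK01 LK02)],
};

# Flatten all arrays into one list and count each key.
my %count;
$count{$_}++ for map { @$_ } values %$keys;

print "$_: $count{$_}\n" for sort keys %count;
```

With the stand-in data this prints AM01: 2, AM02: 1, AM04: 1, LK01: 1 and LK02: 1, one per line.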
Jenda
Enoch was right!
Enjoy the last years of Rome.
Thank you very much for this!
Re: How to return two and more values by parsing XML with XML::Rules?
by Anonymous Monk on Nov 06, 2012 at 10:11 UTC
Your sample XML doesn't match your wanted data -- irritating
Sorry, I did not mention that I have the rule
'Capacities_Outpatient_Clinic_Special' => 'pass',
before the mentioned chunk of the script. Thank you for pointing that out.
Apart from that, this is a fragment of a script that actually runs (only with the field names translated into English). The above rule should not affect the code fragment, however; perhaps I overlooked another mismatch?
Update: It must be the result presentation that has to be adjusted in my fragment - it should be:
- Outpatient_Services:
AM11: 1
AM07: 1
AM04: 1
#!/usr/bin/perl --
use strict; use warnings;
use XML::Rules;
use Data::Dump qw/ dd /;
my $ta = XML::Rules->new(
qw/ stripspaces 8 /,
rules => {
'Outpatient_Services' => 'no content',
'Outpatient_Service' => 'as array no content',
#~ 'Outpatient_Clinic' => 'content by AM_Key',
'Outpatient_Clinic' => sub {
#~ $rule->( $tag_name, \%attrs, \@context, \@parent_data, $parser)
#~ my ($tagname, $attrHash, $contextArray, $parentDataArray, $parser) = @_;
my $amk = $_[1]->{AM_Key} ;
return unless $amk;
# the unary plus makes sure the braces are parsed as a hash ref, not a block
return +{ $amk => 1 };
},
#~ _default => sub { $_[0] => $_[1]->{_content} },
_default => 'content',
'Outpatient_Clinic_Special' => undef,
},
);
my $ref = $ta->parsefile( 'pm1002448.xml' );
dd $ref;
use YAML(); print YAML::Dump( $ref);
__END__
{
Outpatient_Services => {
Outpatient_Service => [
{ AM01 => 1 },
{ AM01 => 1 },
{ AM02 => 1 },
{},
{ AM04 => 1 },
{},
],
},
}
---
Outpatient_Services:
Outpatient_Service:
- AM01: 1
- AM01: 1
- AM02: 1
- {}
- AM04: 1
- {}
Re: How to return two and more values by parsing XML with XML::Rules?
by Anonymous Monk on Nov 06, 2012 at 11:37 UTC
{
use strict; use warnings;
use Data::Dump qw/ dd /;
use XML::Twig ;
my( %os, @amk );
XML::Twig->new(
twig_handlers => {
#~ '/Outpatient_Services/Outpatient_Service/Outpatient_Clinic/AM_Key' => sub {
'AM_Key' => sub {
print $_->xpath, "\n";
push @amk, $_->trimmed_text;
},
'Outpatient_Service' => sub {
print $_->xpath, "\n";
$os{ shift @amk }++ while @amk;
},
},
)->xparse( 'pm1002448.xml' );
my $ref = {
Outpatient_Services => \%os,
};
dd $ref; use YAML(); print YAML::Dump( $ref);
}
__END__
/Outpatient_Services/Outpatient_Service/Outpatient_Clinic/AM_Key
/Outpatient_Services/Outpatient_Service
/Outpatient_Services/Outpatient_Service[2]/Outpatient_Clinic/AM_Key
/Outpatient_Services/Outpatient_Service[2]
/Outpatient_Services/Outpatient_Service[3]/Outpatient_Clinic/AM_Key
/Outpatient_Services/Outpatient_Service[3]
/Outpatient_Services/Outpatient_Service[4]
/Outpatient_Services/Outpatient_Service[5]/Outpatient_Clinic/AM_Key
/Outpatient_Services/Outpatient_Service[5]
/Outpatient_Services/Outpatient_Service[6]
{ Outpatient_Services => { AM01 => 2, AM02 => 1, AM04 => 1 } }
---
Outpatient_Services:
AM01: 2
AM02: 1
AM04: 1
Wow thanks!
It works with LK_Keys as well:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dump qw/ dd /;
use XML::Twig ;
my( %os, @amk );
XML::Twig->new(
twig_handlers => {
'AM_Key' => sub {
# print $_->xpath, "\n";
push @amk, $_->trimmed_text;
},
'LK_Key' => sub {
# print $_->xpath, "\n";
push @amk, $_->trimmed_text;
},
'Outpatient_Service' => sub {
# print $_->xpath, "\n";
$os{ shift @amk }++ while @amk;
},
},
)->xparse( shift );
my $ref = {
Outpatient_Services => \%os,
};
# dd $ref;
use YAML::XS();
print YAML::XS::Dump( $ref);
prints:
Outpatient_Services:
AM01: 2
AM02: 1
AM04: 1
LK01: 1
LK02: 1
Re: How to return two and more values by parsing XML with XML::Rules?
by vagabonding electron (Curate) on Nov 06, 2012 at 14:31 UTC
After some thought and reading I was finally able to produce the desired output with XML::Rules.
#!/usr/bin/perl
use strict;
use warnings;
use XML::Rules;
use YAML::XS;
my $parser = XML::Rules->new(
rules => {
'Capacities_Outpatient_Clinic,
Other,
Outpatient_Clinic,
Outpatient_Clinic_Special,
Outpatient_Services'
=> 'no content',
'Capacities_Outpatient_Clinic_Special' => 'pass',
'AM_Key,
AM_Other_Key,
AM_Special_Key,
Description,
Explanations,
LK_Key,
Type,
VA_VU_Key_Outpatient_Clinic,
VA_VU_Other_Key_Outpatient_Clinic'
=> 'content',
'Care_Point'
=> 'as array no content',
'Capacity' => sub {$_[1]->{LK_Key} => 1},
'Outpatient_Service' => sub {
if (exists $_[1]->{Outpatient_Clinic})
{
if ( exists $_[1]->{Outpatient_Clinic}->{Other} )
{
return $_[1]->{Outpatient_Clinic}->{Description} => 1
}
else
{
return $_[1]->{Outpatient_Clinic}->{AM_Key} => 1
}
}
elsif ( exists $_[1]->{Outpatient_Clinic_Special} )
{
# a real hash avoids dereferencing undef when no LK key is present
my %h;
for ( keys %{ $_[1]->{Outpatient_Clinic_Special} } )
{
$h{$_} = 1 if /LK\d*/;
}
return %h;
}
else
{ }
},
}
);
my $data = $parser->parsefile(shift);
print Dump $data;
which prints
---
Outpatient_Services:
AM01: 1
AM02: 1
AM04: 1
Description of the Outpatient Clinic: 1
LK01: 1
LK02: 1
from the posted xml fragment.
What I still do not know is whether this is a proper use of the module or a workaround.