PerlMonks  

Reading multi-level-tag XML file

by CSharma (Sexton)
on Jul 27, 2015 at 14:30 UTC ([id://1136471], perlquestion)

CSharma has asked for the wisdom of the Perl Monks concerning the following question:

Hi PerlMonks, I'm new to Perl, please help! I have an XML file with tags like the ones below, and I'm using the XML::Simple module.
    <?xml version="1.0" encoding="UTF-8"?>
    <DataFeed recordCount="1377">
      <SellerInformation>
        <Seller sellerIdFromProvider="527543">BarketShop.com</Seller>
        <TaxableLocationsCollection>
          <TaxableLocation locationType="state" locationValue="FL">1</TaxableLocation>
        </TaxableLocationsCollection>
        <ShippingChargesCollection>
          <ShippingCharge type="fixed"/>
        </ShippingChargesCollection>
      </SellerInformation>
      <SellerInformation>
        <Seller sellerIdFromProvider="452471">Global Industrial</Seller>
        <TaxableLocationsCollection>
          <TaxableLocation locationType="state" locationValue="GA">1</TaxableLocation>
          <TaxableLocation locationType="state" locationValue="NV">1</TaxableLocation>
          <TaxableLocation locationType="state" locationValue="NJ">1</TaxableLocation>
          <TaxableLocation locationType="state" locationValue="NY">1</TaxableLocation>
        </TaxableLocationsCollection>
        <ShippingChargesCollection>
          <ShippingCharge type="fixed"/>
        </ShippingChargesCollection>
      </SellerInformation>
    </DataFeed>
My requirement is to get sellerIdFromProvider and the value of each locationValue, i.e. sellerIdFromProvider => 452471 and locationValue => GA, NV, NJ, NY. Using the code below I can get the values for the first *SellerInformation* tag (it has a single TaxableLocation, so it can be accessed as a hash) but not for the second one, which has multiple TaxableLocation tags (and would be accessed as an array). So I need code that can handle both cases. I also have to add the comma-separated locationValue list to another XML file, matching records by sellerIdFromProvider; I think hashes would be used for that. I hope I'm clear! Please help me out. Thanks, CSharma
    #!/usr/bin/perl
    use XML::Simple;
    use Data::Dumper;

    $xml = new XML::Simple;
    foreach $d (@{$data->{'SellerInformation'}}) {
        print $d->{'Seller'}{'sellerIdFromProvider'} . "\n";
        print $d->{'TaxableLocationsCollection'}{'TaxableLocation'}{'locationValue'} . "\n";
    }
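
The hash-versus-array problem described in the question can be illustrated without any XML module. The following is only a minimal sketch: the two data shapes are hand-built stand-ins assumed from the symptoms in the question (not real parser output), and the helper name `as_list` is hypothetical. It accepts a value that may be a single hash reference or an array reference of them and always returns a list.

```perl
use strict;
use warnings;

# Hypothetical helper: return a list whether the value is a single
# hash ref, an array ref of hash refs, or undef.
sub as_list {
    my ($value) = @_;
    return ()      unless defined $value;
    return @$value if ref($value) eq 'ARRAY';
    return ($value);
}

# Hand-built stand-ins for the two shapes (assumed, not real parser output).
my $one  = { locationValue => 'FL' };
my $many = [ map { +{ locationValue => $_ } } qw(GA NV NJ NY) ];

for my $locations ($one, $many) {
    print join(', ', map { $_->{locationValue} } as_list($locations)), "\n";
}
```

A loop written against `as_list(...)` no longer needs to care how many TaxableLocation elements a seller has, which is the same effect the replies below get by making the parser return arrays consistently.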

Replies are listed 'Best First'.
Re: Reading multi-level-tag XML file
by choroba (Cardinal) on Jul 27, 2015 at 14:48 UTC
    Now you know why XML::Simple says
    The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces. In particular, XML::LibXML is highly recommended.

    You can solve your issues by using the ForceArray option to XMLin:

    my $data = XMLin($xml_string, ForceArray => 1);
    for my $inf (@{ $data->{SellerInformation} }) {
        print $inf->{Seller}[0]{sellerIdFromProvider}, "\n";
        for my $location (@{ $inf->{TaxableLocationsCollection}[0]{TaxableLocation} }) {
            print "\t", $location->{locationValue}, "\n";
        }
    }
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
      Thanks Choroba! That worked. I want output like the example below, but I'm getting unexpected results: every seller entry is printed with the location values from all the tags. Something here is messy. Could you please help? The desired output is the seller ID followed by its comma-separated locations, e.g. 426089 NY, CA, FL
      for my $inf (@{$data->{'SellerInformation'}}) {
          $sellerid = $inf->{'Seller'}[0]{'sellerIdFromProvider'};
          $i = 0;
          for my $loc (@{$inf->{'TaxableLocationsCollection'}[0]{'TaxableLocation'}}) {
              $location[$i++] = $loc->{'locationValue'};
          }
          $location{$sellerid} = join(", ", @location);
          print $sellerid, "\t" . $location{$sellerid} . "\n";
      }

        You are not clearing the @location array for each Seller. Try

        for my $inf (@{$data->{'SellerInformation'}}) {
            my $sellerid = $inf->{'Seller'}[0]{'sellerIdFromProvider'};
            my @location = ();
            for my $loc (@{$inf->{'TaxableLocationsCollection'}[0]{'TaxableLocation'}}) {
                push @location, $loc->{'locationValue'};
            }
            print $sellerid . "\t" . join(", ", @location) . "\n";
        }
        poj
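
To make the stale-array effect concrete, here is a small self-contained sketch in plain Perl. The seller IDs and the sub names `shared_array` and `fresh_array` are made up for illustration. The first sub reuses one array across sellers and only resets the index, as in the code being debugged; the second declares a fresh array per seller, as in the fix above.

```perl
use strict;
use warnings;

# Hypothetical input: [ seller ID, [ states ] ] pairs, in processing order.
my @sellers = ( [ 'A', [qw(GA NV NJ)] ], [ 'B', [qw(FL)] ] );

# Buggy variant: one shared array; the index is reset but the array is
# never emptied, so leftovers from a longer earlier list survive.
sub shared_array {
    my @location;
    my @lines;
    for my $seller (@_) {
        my ($id, $states) = @$seller;
        my $i = 0;
        $location[$i++] = $_ for @$states;
        push @lines, "$id\t" . join(', ', @location);
    }
    return @lines;
}

# Fixed variant: a fresh lexical array for every seller.
sub fresh_array {
    my @lines;
    for my $seller (@_) {
        my ($id, $states) = @$seller;
        my @location = @$states;
        push @lines, "$id\t" . join(', ', @location);
    }
    return @lines;
}

print "shared: $_\n" for shared_array(@sellers);
print "fresh:  $_\n" for fresh_array(@sellers);
```

With the shared array, seller B is reported with `FL, NV, NJ` because only the first slot was overwritten; with the per-seller array it is reported with `FL` alone.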
Re: Reading multi-level-tag XML file
by poj (Abbot) on Jul 27, 2015 at 15:29 UTC
    Using XML::Twig
    #!perl
    use strict;
    use warnings;
    use XML::Twig;
    use Data::Dump 'pp';

    my %hash = ();
    my $twig = new XML::Twig(
        twig_handlers => { 'SellerInformation' => \&info }
    );
    $twig->parsefile('test.xml');
    pp \%hash;

    sub info {
        my ($t, $e) = @_;
        my $id  = $e->first_child("Seller")->att('sellerIdFromProvider');
        my $col = $e->first_child("TaxableLocationsCollection");
        for my $loc ($col->descendants("TaxableLocation")) {
            push @{ $hash{$id} }, $loc->att('locationValue');
        }
    }
    poj
Re: Reading multi-level-tag XML file
by choroba (Cardinal) on Jul 27, 2015 at 15:56 UTC
    Using XML::XSH2, a wrapper around XML::LibXML:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::XSH2;
    use Data::Dumper;

    xsh << 'end.';
        open 1.xml ;
        $h := hash ../../../Seller/@sellerIdFromProvider
            /DataFeed/SellerInformation/TaxableLocationsCollection/TaxableLocation/@locationValue ;
    end.

    $_ = [ map $_->getValue, @$_ ] for values %$XML::XSH2::Map::h; # Convert attributes to strings.
    print Dumper($XML::XSH2::Map::h);

    Update: show how to convert from XML::LibXML::Attribute objects to strings.
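
For comparison, the same extraction can be sketched with XML::LibXML directly, without the XSH2 wrapper. This is a hedged illustration rather than anyone's posted solution: the sub name `locations_by_seller` is hypothetical, the sample document is a trimmed copy of the one in the question, and it assumes XML::LibXML is installed.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: map each seller ID to an array ref of its
# locationValue strings, using XPath relative to each SellerInformation.
sub locations_by_seller {
    my ($xml_string) = @_;
    my $dom = XML::LibXML->load_xml(string => $xml_string);
    my %locations;
    for my $info ($dom->findnodes('/DataFeed/SellerInformation')) {
        my $id = $info->findvalue('Seller/@sellerIdFromProvider');
        $locations{$id} = [
            map { $_->getValue }
            $info->findnodes('TaxableLocationsCollection/TaxableLocation/@locationValue')
        ];
    }
    return \%locations;
}

# Trimmed sample based on the XML in the question.
my $xml = <<'XML';
<DataFeed recordCount="1377">
  <SellerInformation>
    <Seller sellerIdFromProvider="527543">BarketShop.com</Seller>
    <TaxableLocationsCollection>
      <TaxableLocation locationType="state" locationValue="FL">1</TaxableLocation>
    </TaxableLocationsCollection>
  </SellerInformation>
  <SellerInformation>
    <Seller sellerIdFromProvider="452471">Global Industrial</Seller>
    <TaxableLocationsCollection>
      <TaxableLocation locationType="state" locationValue="GA">1</TaxableLocation>
      <TaxableLocation locationType="state" locationValue="NV">1</TaxableLocation>
    </TaxableLocationsCollection>
  </SellerInformation>
</DataFeed>
XML

my $locations = locations_by_seller($xml);
print "$_\t", join(', ', @{ $locations->{$_} }), "\n" for sort keys %$locations;
```

Because `findnodes` always returns a list, one location and several locations are handled by the same loop.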

    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Reading multi-level-tag XML file
by runrig (Abbot) on Jul 27, 2015 at 16:35 UTC
    With XML::Rules:
    use warnings;
    use strict;
    use XML::Rules;

    my %data;
    my $seller_id;
    my @rules = (
        Seller          => sub { $seller_id = $_[1]->{sellerIdFromProvider} },
        TaxableLocation => sub { push @{$data{$seller_id}}, $_[1]->{locationValue} },
        _default        => undef,
    );

    my $xr = XML::Rules->new( rules => \@rules );
    $xr->parsefile('file.xml');

    use Data::Dumper;
    print Dumper \%data;
Re: Reading multi-level-tag XML file
by GotToBTru (Prior) on Jul 27, 2015 at 14:40 UTC

    Don't use XML::Simple. Explore using XML::Twig or other, more functional modules.

    Dum Spiro Spero
Re: Reading multi-level-tag XML file
by Hermano23 (Beadle) on Jul 27, 2015 at 14:46 UTC
    First, you should add use strict; and use warnings; to the top of your script. They catch undeclared variables, typos in variable names, and other common mistakes early.

    Storing your strings of locations in a hash keyed by the sellerId is a good idea. That should work for what you need to do next.
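
As a rough sketch of that next step, the following plain-Perl example attaches a comma-separated location string to records from a second data set wherever the seller ID matches. The second data set, its field names, and the sub name `add_locations` are all hypothetical; reading and writing the second XML file is left out.

```perl
use strict;
use warnings;

# Seller ID => list of states, as collected by the loops in this thread.
my %locations = (
    527543 => ['FL'],
    452471 => [qw(GA NV NJ NY)],
);

# Hypothetical records from the second file, keyed by the same seller ID.
my %other = (
    452471 => { name => 'Global Industrial' },
    999999 => { name => 'No matching seller' },
);

# Hypothetical helper: add a comma-separated state list where IDs match;
# records without a match are left untouched.
sub add_locations {
    my ($records, $locations_by_id) = @_;
    for my $id (keys %$records) {
        next unless exists $locations_by_id->{$id};
        $records->{$id}{locations} = join ', ', @{ $locations_by_id->{$id} };
    }
    return $records;
}

add_locations(\%other, \%locations);
for my $id (sort keys %other) {
    printf "%s\t%s\t%s\n", $id, $other{$id}{name}, $other{$id}{locations} // '';
}
```

Keeping the states as an array until the final `join` makes it easy to change the separator or filter the list later.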
Re: Reading multi-level-tag XML file
by Jenda (Abbot) on Oct 13, 2015 at 11:55 UTC

    Just to add another option to an old thread for future reference. If you do not like to use global variables, you can change the rules slightly and make the parser return the data:

    use strict;
    use XML::Rules;

    my $xr = XML::Rules->new(
        stripspaces => 15,
        rules => {
            '^ShippingChargesCollection' => 'skip', # skip the <ShippingChargesCollection>...</ShippingChargesCollection>
            TaxableLocation => sub {
                my ($tag, $attr) = @_;
                return if $attr->{locationType} ne 'state'; # ignore other location types
                return '@states' => $attr->{locationValue}; # push the state into the data of the parent tag
            },
            TaxableLocationsCollection => 'pass no content', # dissolve the TaxableLocationsCollection, ignore text content
            Seller => 'pass', # dissolve the <Seller> tag
            SellerInformation => sub {
                my ($tag, $attr) = @_;
                return $attr->{sellerIdFromProvider} => {name => $attr->{_content}, states => $attr->{states}};
                # sellerIdFromProvider and _content come from the dissolved <Seller>
                # states is an array that comes from <TaxableLocation>
                # use the id as the hash key
            },
            'DataFeed' => sub { delete $_[1]->{recordCount}; return $_[1] } # ignore the recordCount attribute, return the data from SellerInformation
        });
    my $data = $xr->parsefile($filename);

    use Data::Dumper;
    print Dumper $data;

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.

Approved by Corion