
Re: How to get paired values from the nested XML structure?

by Jenda (Abbot)
on Nov 01, 2011 at 19:01 UTC (#935186)


in reply to How to get paired values from the nested XML structure?

Would a data structure like this suit you?

{
  'Entity B' => {
    'Boss' => 'Name',
    'unitnumber' => '2',
    'contactinfo' => {
      'URL' => undef,
      'email' => undef,
      'Telefon' => {
        'telnumber' => '456',
        'directcall' => '78910',
        'code' => '0999'
      },
      'Address' => {
        'zip' => '11111',
        'Street' => 'SomeOtherStreet',
        'Ort' => 'City',
        'Building' => '2'
      },
      'Fax' => {
        'telnumber' => '456',
        'directcall' => '10987',
        'code' => '0999'
      }
    },
    'products' => {
      'E5722' => 'few',
      'C8099' => '522',
      'F3596' => 'few',
      'B0765' => '988',
      'A1136' => '1982',
      'D3938' => 'few'
    }
  },
  'Entity A' => {
    'Boss' => 'Name',
    'unitnumber' => '1',
    'contactinfo' => {
    ...

use strict;
use XML::Rules;
use Data::Dumper;

my $parser = XML::Rules->new(
    rules => {
        # pass the units' data up unchanged, ignoring text content
        'excerpt' => 'pass no content',
        # containers: keep the child data as a hash, drop the text content
        'Address,Fax,Telefon,contactinfo,products' => 'no content',
        # leaf tags: tag name => text content
        'Boss,Building,Name,Ort,Street,URL,art_code,code,directcall,email,quantity,quantity_small,telnumber,unitnumber,zip' => 'content',
        # each article adds one art_code => quantity pair to <products>
        'article' => sub {
            if (exists $_[1]->{quantity_small}) {
                return ( $_[1]->{art_code} => 'few' );
            } else {
                return ( $_[1]->{art_code} => $_[1]->{quantity} );
            }
        },
        # key each unit's hash by the content of its <Name>
        'unit' => 'no content by Name',
    }
);

my $data = $parser->parse(\*DATA);
print Dumper($data);
__DATA__
<excerpt>
  <unit>
    <unitnumber>1</unitnumber>
    ...

The base set of rules was generated by:

  perl -MData::Dumper -MXML::Rules -e "print Dumper(XML::Rules::inferRulesFromExample( 'c:\temp\excerpt.xml'))"
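
Once the structure is built, the paired values (article code and quantity) can be read by walking the products hash of each unit. The following is only a minimal sketch: the hash literal is a hand-copied subset of the dump above standing in for the parser's return value, and product_pairs is a hypothetical helper name, not part of XML::Rules.

```perl
use strict;
use warnings;

# Stand-in for the parsed data: a subset of 'Entity B' from the dump above.
my $data = {
    'Entity B' => {
        'unitnumber' => '2',
        'products'   => {
            'E5722' => 'few',
            'C8099' => '522',
            'B0765' => '988',
        },
    },
};

# Hypothetical helper: returns one "unit;art_code;quantity" line per pair,
# with units and article codes sorted so the order is stable.
sub product_pairs {
    my ($data) = @_;
    my @lines;
    for my $unit (sort keys %$data) {
        my $products = $data->{$unit}{products} || {};
        for my $code (sort keys %$products) {
            push @lines, join ';', $unit, $code, $products->{$code};
        }
    }
    return @lines;
}

print "$_\n" for product_pairs($data);
```

Lines in this shape could go straight into a CSV file, one file per nested structure if that suits the database better.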

Jenda
Enoch was right!
Enjoy the last years of Rome.


Re^2: How to get paired values from the nested XML structure?
by vagabonding electron (Hermit) on Nov 01, 2011 at 19:43 UTC
    Thank you Jenda,
    I must read this carefully and try it. I did not know XML::Rules before. The good news: this module is available for ActivePerl.
    The "real", huge XML file consists of many nested structures like the one in the example. They "dive" down from the surface of simple data such as the "address" or the "boss name" (and the "unit_id").
    Hence my idea to write several CSV files, each carrying the id of the unit (shown in the example as the unit name), and to join them in the database later. This eclectic (promiscuous? :-)) idea arises because my knowledge of Perl is limited and I have to get things running at the same time.

      If the file is huge you can process it in parts. With XML::Rules that means the rule for <unit> becomes a subroutine that inserts the unit's data into the database and then returns nothing. That way the already processed data is not kept in memory.
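
A minimal sketch of that approach, under several assumptions: the sample XML is invented (only the element names follow the rules shown earlier in the thread), and the @rows array is a stand-in for the database, so a real script would run its insert where the push is.

```perl
use strict;
use warnings;
use XML::Rules;

# Hypothetical sample input: element names as in the rules, values invented.
my $xml = <<'XML';
<excerpt>
  <unit>
    <unitnumber>1</unitnumber>
    <Name>Entity A</Name>
    <products>
      <article><art_code>A1136</art_code><quantity>1982</quantity></article>
      <article><art_code>D3938</art_code><quantity_small>yes</quantity_small></article>
    </products>
  </unit>
  <unit>
    <unitnumber>2</unitnumber>
    <Name>Entity B</Name>
    <products>
      <article><art_code>C8099</art_code><quantity>522</quantity></article>
    </products>
  </unit>
</excerpt>
XML

my @rows;    # stand-in for the database table

my $parser = XML::Rules->new(
    rules => {
        'excerpt'  => 'pass no content',
        'products' => 'no content',
        'Name,unitnumber,art_code,quantity,quantity_small' => 'content',
        'article' => sub {
            my ($tag, $attr) = @_;
            my $qty = exists $attr->{quantity_small} ? 'few' : $attr->{quantity};
            return ( $attr->{art_code} => $qty );
        },
        # Called once per finished <unit>: store its rows, then return
        # an empty list so nothing is added to the parent's data.
        'unit' => sub {
            my ($tag, $attr) = @_;
            my $products = $attr->{products} || {};
            for my $code (sort keys %$products) {
                # a real script would do its database insert here
                push @rows,
                  [ $attr->{unitnumber}, $attr->{Name}, $code, $products->{$code} ];
            }
            return;
        },
    }
);

$parser->parse($xml);
print join(';', @$_), "\n" for @rows;
```

Because the unit rule returns nothing, the value returned by parse carries no unit data; everything of interest has already been written out by the handler.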

      Another good module for processing huge XML files is XML::Twig.
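
For comparison, a rough XML::Twig sketch of the same chunk-by-chunk idea. The sample XML is again invented (element names follow the rules shown earlier in the thread) and @rows stands in for the database; the handler runs once per completed unit, and purge then discards what has been parsed so far.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample input: element names as in the rules, values invented.
my $xml = <<'XML';
<excerpt>
  <unit>
    <Name>Entity A</Name>
    <products>
      <article><art_code>A1136</art_code><quantity>1982</quantity></article>
      <article><art_code>D3938</art_code><quantity_small>yes</quantity_small></article>
    </products>
  </unit>
  <unit>
    <Name>Entity B</Name>
    <products>
      <article><art_code>C8099</art_code><quantity>522</quantity></article>
    </products>
  </unit>
</excerpt>
XML

my @rows;    # stand-in for the database table

my $twig = XML::Twig->new(
    twig_handlers => {
        # runs each time a complete <unit> element has been parsed
        unit => sub {
            my ($t, $unit) = @_;
            my $name = $unit->first_child_text('Name');
            for my $article ($unit->descendants('article')) {
                my $qty = $article->first_child('quantity_small')
                    ? 'few'
                    : $article->first_child_text('quantity');
                push @rows, [ $name, $article->first_child_text('art_code'), $qty ];
            }
            $t->purge;    # free the memory used by the units handled so far
        },
    },
);

$twig->parse($xml);
print join(';', @$_), "\n" for @rows;
```

For a file on disk, parsefile would take the place of parse.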

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.

        Thank you Jenda, I will certainly try it too.
        The module XML::Twig is also available for ActivePerl, great! (And by the way, it is a real blessing to be able to install PPM modules manually, without administrator privileges and without proxy issues.)
        Many thanks!
        VE
