comments on xml 2 hash 2 xml using libXML

by irishBatman (Novice)
on May 28, 2012 at 23:43 UTC (#972892)
irishBatman has asked for the wisdom of the Perl Monks concerning the following question:

I have been looking for examples of converting hashes to XML and back again, initially for overlaying settings onto objects. I started using XML::Simple, but was discouraged because at some point I may wish to embed binary data in my XML processing functions. I thought it would be better to start using LibXML and build on that!

I have come up with 2 example scripts: one that shows conversion of a hash to XML, and another that reads this output and prepares it for overlay on a hash. If the scripts are executed in order, the first one should provide the input to the second.

These are only a starting point and not the final functions that I need to create; however, they provide all the basic functionality that I require. I was wondering what the Monks thought of these, and whether, before I proceed any further, there is another approach that may work better?

    use strict;
    use XML::LibXML;

    my %testHash;

    ## create a test hash
    $testHash{root}{testEl1}{currentValue}      = 0;
    $testHash{root}{testEl1}{comment}           = "System test element 1";
    $testHash{root}{testEl1}{enable}            = 1;
    $testHash{root}{testEl2}{currentValue}      = 0;
    $testHash{root}{testEl2}{comment}           = "System test element 2";
    $testHash{root}{testEl2}{enable}            = 1;
    $testHash{root}{testEl2}{newLevel}{enable1} = 1;
    $testHash{root}{testEl2}{newLevel}{enable2} = 1;
    $testHash{root}{testEl2}{newLevel}{enable3} = 1;
    $testHash{root}{testEl3}{newLevel}{enable}  = 1;
    $testHash{root}{testEl2}{reference}         = $testHash{root}{testEl1};

    ## call the hash processor!
    my $string = myHash2Xml( \%testHash );

    ## create the output XML
    open my $out, '>', 'myHash2Xml.xml' or die "Cannot write myHash2Xml.xml: $!";
    print {$out} "$string\n";
    close $out;
    ## end of script

    ## convert our hash to XML, this is a wrapper to create the
    ## document and pass it back as a string
    sub myHash2Xml {
        my ($hash) = @_;

        ## create a new xml document!
        my $xmlDoc = XML::LibXML::Document->new( '1.0', 'UTF-8' );

        ## create the root element and a pointer to it!
        my $root = $xmlDoc->createElement("testDoc");
        $xmlDoc->setDocumentElement($root);

        examineHash( $hash, \$xmlDoc, \$root );    # process the hash
        my $xmlString = $xmlDoc->toString(1);
        print "$xmlString\n";
        return $xmlString;
    }

    ## wrapper for the recurser
    sub examineHash {
        my ( $hash, $xmlDoc, $lastElement ) = @_;
        my %refsHash;    # hash ref counter, used to prevent/detect circular refs
        examineHashRecurse( \%refsHash, $hash, $xmlDoc, $lastElement );    # here we go
    }

    ## recurser!
    sub examineHashRecurse {
        my ( $refsHash, $hash, $xmlDoc, $lastElement ) = @_;
        foreach my $key ( sort { $a cmp $b } keys( %{$hash} ) ) {    # go through keys at current level
            if ( ( $$hash{$key} . "" ) =~ /HASH\(/ ) {               # is it another hash?

                ## it's another hash, go deeper!
                if ( !exists $$refsHash{ $$hash{$key} } ) {    # check if it's in the reference hash
                    $$refsHash{ $$hash{$key} } = 1;

                    # print "create element $key\n";
                    my $newElement = $$xmlDoc->createElement($key);    # create new element
                    $$lastElement->appendChild($newElement);           # add element to our doc under the last element

                    ## remember to pass the new element as a reference.
                    examineHashRecurse( $refsHash, $$hash{$key}, $xmlDoc, \$newElement );
                }
                ## else, this has already been examined! stops circular continuation
                else {
                    # do something with a circular reference???
                    $$refsHash{ $$hash{$key} }++;
                }
            }
            else {
                # print "create attribute $key with value " . $$hash{$key} . "\n";
                $$lastElement->setAttribute( "$key", $$hash{$key} );    # add attribute to the last element!
            }
        }
    }

    use strict;
    use XML::LibXML;

    my $file;
    $file = 'myHash2Xml.xml';

    ## load the XML
    my $parser = XML::LibXML->new();
    my $tree   = $parser->parse_file($file);
    my $root   = $tree->getDocumentElement;

    ## get all the elements that exist in a doc. Note the wildcard to search through everything.
    my @allDocElements = $root->getElementsByTagName('*');

    ## some stats meters
    my $useableElements = 0;
    my $useableAtts     = 0;
    my $count           = @allDocElements;

    ## iterate over all the elements
    foreach my $el1 (@allDocElements) {

        ## if we have no child nodes we are at the bottom of the tree. This is what we
        ## will be looking for most of the time.
        if ( !$el1->hasChildNodes() ) {
            if ( $el1->hasAttributes() ) {

                ## iterate over all the attributes
                $useableElements++;
                foreach my $ttt ( $el1->attributes() ) {
                    $useableAtts++;
                    my $string = $el1->nodePath() . "/" . $ttt->localName . " = " . $ttt->nodeValue;
                    print "AT - $string\n";
                }
            }
            else {
                ## haven't hit this yet
                $useableElements++;
                my $string = $el1->nodePath() . "/--" . $el1->localName . " = " . $el1->nodeValue;
                print "EL - $string\n";
            }
        }
        else {
            ## if we are not at the bottom of the tree, we could still have attributes. Check here
            ## if we do and process as above
            if ( $el1->hasAttributes() ) {
                $useableElements++;
                foreach my $ttt ( $el1->attributes() ) {
                    $useableAtts++;
                    my $string = $el1->nodePath() . "/" . $ttt->localName . " = " . $ttt->nodeValue;
                    print "AT - $string\n";
                }
            }
            else {
                ## keep an eye out for text added to elements
                if ( $el1->textContent() !~ /\n/ ) {
                    $useableElements++;
                    my $string = $el1->nodePath() . " = " . $el1->textContent;
                    print "TX - $string\n";
                }
            }
        }
    }
    print "\nFrom $count we found $useableElements useable elements, with $useableAtts attributes\n";
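
The second script only prints path/value pairs; it does not yet apply them to a hash. As a rough sketch of what that overlay step might look like, the snippet below walks a slash-separated path and sets the value at its end. The helper name `overlay_path` and the `%settings` data are hypothetical, and the sketch assumes the paths carry no positional predicates such as `[2]`, which `nodePath()` can add when sibling elements share a name.

```perl
use strict;
use warnings;

# Hypothetical helper: set $value inside %$target at the location named by a
# slash-separated path such as "/testDoc/root/testEl1/enable".
sub overlay_path {
    my ( $target, $path, $value ) = @_;
    my @steps = grep { length } split m{/}, $path;    # drop the empty leading step
    my $leaf  = pop @steps;
    return unless defined $leaf;

    my $node = $target;
    for my $step (@steps) {
        # Create (or replace a non-hash value with) an intermediate level.
        $node->{$step} = {} unless ref( $node->{$step} ) eq 'HASH';
        $node = $node->{$step};
    }
    $node->{$leaf} = $value;
    return;
}

# Existing settings to be overlaid (illustrative data only).
my %settings = ( testDoc => { root => { testEl1 => { enable => 0, comment => 'old' } } } );

overlay_path( \%settings, '/testDoc/root/testEl1/enable',          1 );
overlay_path( \%settings, '/testDoc/root/testEl3/newLevel/enable', 1 );

print "testEl1 enable is now $settings{testDoc}{root}{testEl1}{enable}\n";
```

Values that the overlay does not mention (here `comment`) are left as they were, which is usually the point of overlaying settings rather than replacing them.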

Replies are listed 'Best First'.
Re: comments on xml 2 hash 2 xml using libXML
by space_monk (Chaplain) on Mar 18, 2013 at 14:29 UTC

    I found this when I was browsing for a similar function myself, so I'm sorry for arriving late to the party! :-) I'm just posting this in case anyone else is looking for a similar answer.

    There are a couple of CPAN modules which appear to do a similar job, specifically XML::Hash and XML::Hash::LX.

    Both modules predate the 2012 date of the question by a couple of years.

    A Monk aims to give answers to those who have none, and to learn from those who know more.

      No module that aims to convert between XML and hashrefs does a good job. This is simply because the data model of XML is nothing like a hash.

      How do you represent this as a hash?

      <root toot="1"> 2 <toot>3</toot> 4 <toot>5</toot> 6 </root>

      You either end up with a hopelessly complicated hash/array structure to represent it unambiguously:

      {
        root => {
          attributes => { toot => 1, },
          contents   => [
            2,
            { toot => { contents => [3] } },
            4,
            { toot => { contents => [5] } },
            6
          ],
        }
      }

      ... which is a nightmare to find stuff in. Or you do this:

      { name => "root", toot => [1,3,5], text => [2,4,6], }

      ... and stop caring about stuff like the distinction between attributes and elements and text nodes, the order of a node's children, etc... which means that when you start outputting XML, the XML you generate will confuse the hell out of any tools that consume it.
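
      To make the information loss concrete, here is a small core-Perl sketch. The `flatten` helper is hypothetical (it is not how any particular CPAN module works); it collapses the verbose structure above into the short form, and two documents with different child order come out identical.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order so dumps can be compared
$Data::Dumper::Indent   = 0;

# Hypothetical helper: collapse the ordered structure into the lossy form by
# pooling attribute values, child-element text and loose text per name.
# Assumes child elements contain text only, as in the example above.
sub flatten {
    my ($tree) = @_;
    my ($name) = keys %$tree;
    my $node   = $tree->{$name};
    my %flat   = ( name => $name );
    for my $attr ( sort keys %{ $node->{attributes} || {} } ) {
        push @{ $flat{$attr} }, $node->{attributes}{$attr};
    }
    for my $item ( @{ $node->{contents} || [] } ) {
        if ( ref($item) eq 'HASH' ) {
            my ($child) = keys %$item;
            push @{ $flat{$child} }, @{ $item->{$child}{contents} };
        }
        else {
            push @{ $flat{text} }, $item;
        }
    }
    return \%flat;
}

# The document from the example: text and <toot> children interleaved.
my $interleaved = {
    root => {
        attributes => { toot => 1 },
        contents   => [ 2, { toot => { contents => [3] } }, 4, { toot => { contents => [5] } }, 6 ],
    },
};

# A different document: both <toot> children first, then the text.
my $grouped = {
    root => {
        attributes => { toot => 1 },
        contents   => [ { toot => { contents => [3] } }, { toot => { contents => [5] } }, 2, 4, 6 ],
    },
};

my $same = Dumper( flatten($interleaved) ) eq Dumper( flatten($grouped) );
print $same ? "indistinguishable after flattening\n" : "still distinguishable\n";
```

      Once both documents have become the same hash, nothing downstream can regenerate the original child order.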

      Which is not to say that particular flavours of XML - e.g. RSS or Atom or OPML or blah or blah - cannot be usefully converted to a hash by a module that understands the schema. If you know that an Atom <entry> element can never validly have a title attribute, but will always have exactly one <title> element as a child, then representing that as:

      my $entry = { title => "...", ..., };

      ... is fine. But modules like XML::Simple and its ilk don't target specific flavours of XML; they try to handle generic XML.
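
      A schema-aware conversion of that kind could look roughly like the sketch below, which uses XML::LibXML with an XPath context. The function name `atom_entry_to_hash`, the sample entry and the choice of fields are illustrative assumptions; a real Atom consumer would handle many more elements.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Made-up sample entry, for illustration only.
my $xml = <<'XML';
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example entry</title>
  <id>urn:example:1</id>
  <updated>2013-03-18T14:29:00Z</updated>
</entry>
XML

# Hypothetical helper: pull a few single-valued child elements of an Atom
# <entry> into a flat hash. This is only safe because the schema says each
# of these elements appears at most once and carries plain text.
sub atom_entry_to_hash {
    my ($string) = @_;
    my $doc = XML::LibXML->load_xml( string => $string );
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs( atom => 'http://www.w3.org/2005/Atom' );

    my %entry;
    for my $field (qw(title id updated)) {
        $entry{$field} = $xpc->findvalue("/atom:entry/atom:$field");
    }
    return \%entry;
}

my $entry = atom_entry_to_hash($xml);
print "$entry->{title}\n";
```

      The hash is unambiguous here only because the code knows which elements to expect; it makes no attempt to handle arbitrary XML.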

      Even the very best generic XML-to-hash module will be horribly broken, because the whole concept is horribly broken.

      package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name

        Yes, fortunately I'm going from a hash to XML, but recognise that it will have issues.....

      How do they compare?

        I'll let you know in a few weeks' time, as I'll probably be evaluating them for use in the project I'm working on. It's only my first day in the office today....

        Update: I was going to post a request for help because I couldn't get XML::Hash::LX to build in Cygwin; however installing "make" seemed to fix that one.... :-P

Re: comments on xml 2 hash 2 xml using libXML
by space_monk (Chaplain) on Mar 20, 2013 at 08:28 UTC
    N.B. Because hashes are essentially unordered, it is perhaps a good idea to tie your hash (perhaps using Tie::Hash::Indexed) to ensure that it produces XML in the right order.
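
    If a tied hash is not an option, one core-only alternative is to carry the desired key order in a separate list. The sketch below does not use Tie::Hash::Indexed; the helper name `hash_to_elements` and the sample data are hypothetical, and it builds the XML as a plain string purely to keep the example dependency-free.

```perl
use strict;
use warnings;

# Hypothetical helper: render a flat hash as child elements of <$name>,
# in the caller-supplied key order rather than Perl's internal hash order.
sub hash_to_elements {
    my ( $name, $data, $order ) = @_;
    my $xml = "<$name>";
    for my $key (@$order) {
        next unless exists $data->{$key};
        my $value = $data->{$key};    # work on a copy, leave the hash untouched
        $value =~ s/&/&amp;/g;        # minimal escaping for text content
        $value =~ s/</&lt;/g;
        $xml .= "<$key>$value</$key>";
    }
    return "$xml</$name>";
}

my %data  = ( enable => 1, comment => 'System test element 1', currentValue => 0 );
my @order = qw(currentValue comment enable);    # the order wanted in the output

my $result = hash_to_elements( 'testEl1', \%data, \@order );
print "$result\n";
```

    Sorting the keys, as the original script does, also gives repeatable output, but only in alphabetical order; an explicit list or an order-preserving tied hash is needed when the consumer expects a specific sequence.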

Node Type: perlquestion [id://972892]
Approved by kcott