PerlMonks  

Re^2: Is there any XML reader like this?

by BrowserUk (Pope)
on Jan 13, 2012 at 23:06 UTC (#947847)


in reply to Re: Is there any XML reader like this?
in thread Is there any XML reader like this?

Sorry, but that is just so much BS. You simply need to add one simple option:

C:\test>junk44

#! perl -slw
use strict;
use Data::Dump qw[ pp ];
use XML::Simple;

my $xml = XMLin( \*DATA, ForceArray => 1 );
pp $xml;

__DATA__
<servers>
  <station18>
    <ip>10.0.0.101</ip>
    <ip>10.0.1.101</ip>
    <ip>10.0.0.102</ip>
    <ip>10.0.0.103</ip>
    <ip>10.0.1.103</ip>
  </station18>
  <station19>
    <ip>10.0.0.111</ip>
    <ip>10.0.1.111</ip>
    <ip>10.0.0.112</ip>
    <ip>10.0.0.113</ip>
    <ip>10.0.1.113</ip>
  </station19>
  <station17>
    <ip>10.0.0.121</ip>
  </station17>
</servers>

Produces:

{
  station17 => [{ ip => ["10.0.0.121"] }],
  station18 => [
    {
      ip => [
        "10.0.0.101",
        "10.0.1.101",
        "10.0.0.102",
        "10.0.0.103",
        "10.0.1.103",
      ],
    },
  ],
  station19 => [
    {
      ip => [
        "10.0.0.111",
        "10.0.1.111",
        "10.0.0.112",
        "10.0.0.113",
        "10.0.1.113",
      ],
    },
  ],
}

Which is still far simpler than wasting your time trying to figure out how to use those complex monsters.
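
For readers who want to see what walking that structure looks like, here is a minimal sketch. It assumes the data has exactly the shape dumped above (every element wrapped in an array ref by ForceArray => 1). The helper name station_ips is made up for illustration, and the hash is typed in by hand so the snippet runs without XML::Simple installed.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten the ForceArray => 1 structure into
# "station: ip" lines, sorted by station name for stable output.
sub station_ips {
    my ($servers) = @_;
    my @lines;
    for my $name ( sort keys %$servers ) {
        # ForceArray => 1 wraps every element in an array ref,
        # so each station is a list of hashes.
        for my $station ( @{ $servers->{$name} } ) {
            push @lines, "$name: $_" for @{ $station->{ip} || [] };
        }
    }
    return @lines;
}

# Same shape as the dump above, trimmed to two stations.
my $servers = {
    station17 => [ { ip => ['10.0.0.121'] } ],
    station18 => [ { ip => [ '10.0.0.101', '10.0.1.101' ] } ],
};

print "$_\n" for station_ips($servers);
```

The extra loop over the station array is the price of ForceArray => 1: every level is a list, even when there is only one item in it.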


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?


Re^3: Is there any XML reader like this?
by ikegami (Pope) on Jan 13, 2012 at 23:24 UTC

    I have no idea why you call XML::LibXML a monster compared to XML::Simple.

    use 5.010;  # enables say and //
    use XML::Simple qw( :strict XMLin );
    local $XML::Simple::PREFERRED_PARSER = 'XML::Parser';

    my $stations = XMLin( \*DATA, ForceArray => 1, KeyAttr => [] );
    for my $station_name (keys %$stations) {
        say $station_name;
        my $station = $stations->{$station_name}[0];
        # The child elements are named <ip>, so the hash key is 'ip'.
        for my $ip (@{ $station->{ip} // [] }) {
            say " $ip";
        }
    }

    use XML::LibXML qw( );

    my $root = XML::LibXML->load_xml( IO => \*DATA )->documentElement;
    for my $station ($root->findnodes('*')) {
        say $station->getName;
        for my $ip ($station->findnodes('ip')) {
            say " ".$ip->textContent;
        }
    }

    And that's not even mentioning the fact that XML::LibXML is 20x faster* and able to handle so much more stuff than XML::Simple (including everyday stuff).

    * — That assumes XML::Parser is used as XML::Simple's backend. XML::LibXML is 10,000x faster than XML::Simple's common default of XML::SAX::PurePerl (which handles encodings really badly).

    Update: Fixed an error in XML::Simple code.
    Update: Fixed an error in XML::LibXML code. ("IO" was misspelled, and the XPath was wrong.)

      I have no idea why you call XML::LibXML a monster compared to XML::Simple.

      Here's one reason:

      XML::LibXML->load: specify location, string, or IO at C:\test\xml1.pl line 7

      This is line 7:

      my $root = XML::LibXML->load_xml( fh => \*DATA )->documentElement;

      So now you've got to wade through the 32 separate pages of XML::LibXML POD to work out why!

      I never have that problem with XML::Simple.



      And here's another reason. Once you've fixed your first error, your code prints nothing at all:

      my $root = XML::LibXML->load_xml( IO => \*DATA )->documentElement;
      for my $station ($root->findnodes('servers/*')) {
          say $station->name;
          for my $ip ( $station->findnodes('ip') ) {
              say " ".$ip->textContent;
          }
      }

      No values. No errors. Nothing! Nada! Zilch! Zip! Not a jot!

      Why? You'll have to go back and wade through those 32 pages again to work that out!



        The documentation for the Parser is all in one page, not 32. The second error was an XPath error. Fixed. At least they were documented and easy to find. Note that I had as many mistakes in the XML::Simple version at first.
        Yes!
        Nothing is printing!

        Thanks,
        Ashok
      And that's not even mentioning the fact that XML::LibXML is 20x faster

      BTW. Even that factually correct claim only tells half the story. Generate a simple and fairly modest XML file using this:

      #! perl -slw
      use strict;
      $|++;

      our $S //= '999';
      our $I //= 10;

      open O, '>', 'junk.xml';
      print O '<servers>';
      for my $s ( '0001' .. $S ) {
          printf "\r%s", $s;
          print O "<station$s>";
          print O '<ip>', join('.', unpack 'C4', pack 'N', int( rand 2**32 ) ), '</ip>' for 1 .. $I;
          print O "</station$s>";
      };
      print O '</servers>';
      close O;

      Like this:

      C:\test>xmlgen -S=9999
      9999
      C:\test>dir junk.xml
      15/01/2012  12:40         2,424,205 junk.xml

      Now run XML::Simple & XML::LibXML scripts that parse that file and iterate the contents and time them:

      C:\test>xmllib junk.xml
      Parsing took 0.290895 seconds
      Iteration took 171.657306 seconds
      Total took 171.959000 seconds
      Check mem:63.6MB

      C:\test>xmlsimple junk.xml
      Parsing took 38.202000 seconds
      Iteration took 0.059186 seconds
      Total took 38.262577 seconds
      Check mem:142MB

      All the time you gained during parsing, you throw away four-fold when accessing the data through the nightmare interface of OO baloney.
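
The scripts behind those numbers were not posted, so what follows is only a guess at the general shape of such a timing harness, not the code that produced the figures above. The helper name time_phase and the stand-in workload are invented for illustration. Time::HiRes is a core module.

```perl
use strict;
use warnings;
use Time::HiRes qw( time );

# Hypothetical helper: run a code ref, report how long it took,
# and return its result together with the elapsed seconds.
sub time_phase {
    my ( $label, $code ) = @_;
    my $start   = time;
    my $result  = $code->();
    my $elapsed = time - $start;
    printf "%s took %.6f seconds\n", $label, $elapsed;
    return ( $result, $elapsed );
}

# Stand-in workload; a real script would call the parser in the first
# phase and walk the resulting tree in the second.
my ( $data, $t_parse ) = time_phase( 'Parsing', sub { [ 1 .. 100_000 ] } );
my ( $sum, $t_iter ) = time_phase(
    'Iteration',
    sub { my $s = 0; $s += $_ for @$data; $s }
);
printf "Total took %.6f seconds\n", $t_parse + $t_iter;
```

Timing the two phases separately is what lets the parse cost and the traversal cost be compared at all; a single total would hide which phase dominates.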

      And if you double the file size:

      C:\test>xmlgen -S=19999
      19999
      C:\test>dir junk.xml
      15/01/2012  12:58         4,868,440 junk.xml

      And now LibXML takes about four times as long as it did on the smaller file, and about nine times as long as XML::Simple:

      C:\test>xmllib junk.xml
      Parsing took 0.560000 seconds
      Iteration took 676.238758 seconds
      Total took 676.802000 seconds
      Check mem:107MB

      C:\test>xmlsimple junk.xml
      Parsing took 75.078000 seconds
      Iteration took 0.124583 seconds
      Total took 75.209615 seconds
      Check mem:254MB

      Increase the file size 10-fold and LibXML will take 100 times longer.

      Now look carefully at the split times. XML::Simple's parsing time is slow, but linear with the file size. Its traversal time is extremely fast and also linear.

      Conversely, LibXML's parsing time is very fast and linear; but its traversal time is horribly slow and quadratic with the file size.
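
The linear-versus-quadratic reading can be checked against the two runs above. If time grows as size**k, then k = log(t2/t1) / log(size2/size1). The sketch below plugs in the posted file sizes and timings; the function name scaling_exponent is made up, and two data points are a thin basis for any growth law, so treat the result as a sanity check rather than a measurement.

```perl
use strict;
use warnings;

# Estimate k in time ~ size**k from two (size, time) measurements.
sub scaling_exponent {
    my ( $size1, $time1, $size2, $time2 ) = @_;
    return log( $time2 / $time1 ) / log( $size2 / $size1 );
}

# File sizes in bytes and phase timings in seconds, taken from the runs above.
my @sizes = ( 2_424_205, 4_868_440 );
my %phase = (
    'XML::LibXML iteration' => [ 171.657306, 676.238758 ],
    'XML::Simple parsing'   => [ 38.202000,  75.078000 ],
);

for my $name ( sort keys %phase ) {
    my ( $t1, $t2 ) = @{ $phase{$name} };
    printf "%-22s k = %.2f\n", $name,
        scaling_exponent( $sizes[0], $t1, $sizes[1], $t2 );
}
```

An exponent near 2 for the LibXML iteration and near 1 for the XML::Simple parse would match the quadratic and linear descriptions given here.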

      It is easy to see which one wins in the speed stakes.



        Not an especially compelling case without posting the source code for the "XML::Simple & XML::LibXML scripts that parse that file and iterate the contents".

        It is easy to see which one wins in the speed stakes.

        Yeah, LibXML. My tests *included* the time it took to extract the data from the tree. The test was done with real world data of various size from three different providers.

        We use XML::Bare with a thin layer to compensate for its awful interface (XML::Simple without ForceArray or any other option), its expectation of getting decoded text, and its lack of namespace support. It's slightly faster when you factor in the time it takes to extract data. Not nearly as capable as libxml, and we had to create an interface just to be able to use it.

Re^3: Is there any XML reader like this?
by tobyink (Abbot) on Jan 14, 2012 at 20:43 UTC

    You simply need to add one simple option

    And that helps you for precisely five minutes until someone adds this to the file:

      <ip assignment="temporary">10.0.0.101</ip>
    

    And then all your code which assumes stations have IP addresses which are arrayrefs of strings breaks again.

      then all your code which assumes stations have IP addresses which are arrayrefs of strings breaks again.

      Nope. This:

      #! perl -slw
      use strict;
      use Data::Dump qw[ pp ];
      use XML::Simple;

      my $xml = XMLin( \*DATA, ForceArray => [ 'ip' ], NoAttr => 1 );
      pp $xml;

      __DATA__
      <servers>
        <station18>
          <ip>10.0.0.101</ip>
          <ip>10.0.1.101</ip>
          <ip>10.0.0.102</ip>
          <ip>10.0.0.103</ip>
          <ip>10.0.1.103</ip>
        </station18>
        <station19>
          <ip>10.0.0.111</ip>
          <ip>10.0.1.111</ip>
          <ip>10.0.0.112</ip>
          <ip>10.0.0.113</ip>
          <ip>10.0.1.113</ip>
        </station19>
        <station17>
          <ip assignment="temporary">10.0.0.101</ip>
          <ip>10.0.0.121</ip>
        </station17>
      </servers>

      Produces this:

      C:\test>junk44
      {
        station17 => { ip => ["10.0.0.101", "10.0.0.121"] },
        station18 => {
          ip => [
            "10.0.0.101",
            "10.0.1.101",
            "10.0.0.102",
            "10.0.0.103",
            "10.0.1.103",
          ],
        },
        station19 => {
          ip => [
            "10.0.0.111",
            "10.0.1.111",
            "10.0.0.112",
            "10.0.0.113",
            "10.0.1.113",
          ],
        },
      }


        I don't dispute that there is yet another XML::Simple option that can fix things again.

        My point is that things appear to work for a while, and then a small change to your input data breaks everything, so you need to alter your code. And then another change to the input data requires another code change.

        One of the major reasons people use config files is so that they can avoid having to make changes to running code when, say, a new IP address is assigned.

      Which may very well be the right thing to do. The format of the data changed, is it really safe to snip out the bits we did expect and ignore the rest?

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.

      I'm getting the following error when the file has mixed stations like the ones below:

      Not an ARRAY reference at ProcessStatus.pl line 44.

      <station20>
        <user>netcool</user>
        <process assignment="temporary">some text</process>
      </station20>
      <station19>
        <user>netcool</user>
        <process>nco_objserv</process>
        <process>nco_p_mttrapd</process>
      </station19>

      Code snippet....
      my $xml = XMLin("PROCESS.CONF");
      ......
      foreach $process ( @{ $xml->{$server}{process} } )
      .....
      Anything missing here?

      Thanks,
      Ashok
        Appreciate any help here!

        Thanks,
        Ashok
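
A likely cause, going by the snippet: XMLin is called with no options, so a lone process element that carries an attribute comes back as a hash reference (attribute plus a content key) instead of an array reference, and dereferencing it with @{ ... } fails. Passing ForceArray => [ 'process' ], as in the earlier reply, makes the value an array reference every time. A more defensive alternative is to normalise the value before looping. The sketch below is only an illustration: element_texts is an invented name, and the hash is typed in by hand to mimic what XMLin returns for the two stations, so it runs without XML::Simple.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text of every child element, whether
# XML::Simple stored it as a plain string, a hash with a 'content' key
# (an element with attributes), or an array ref of either.
sub element_texts {
    my ($value) = @_;
    return () unless defined $value;
    my @items = ref $value eq 'ARRAY' ? @$value : ($value);
    return map { ref $_ eq 'HASH' ? $_->{content} : $_ } @items;
}

# Shapes XMLin produces without ForceArray for the two stations above.
my $xml = {
    station20 => {
        user    => 'netcool',
        process => { assignment => 'temporary', content => 'some text' },
    },
    station19 => {
        user    => 'netcool',
        process => [ 'nco_objserv', 'nco_p_mttrapd' ],
    },
};

for my $server ( sort keys %$xml ) {
    print "$server: $_\n" for element_texts( $xml->{$server}{process} );
}
```

This keeps the loop working when a station has one process, several, or one with attributes, which is the kind of input change discussed earlier in the thread.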
