PerlMonks  

Re^3: XML::Rules parsing inside out?

by Anonymous Monk
on Dec 07, 2017 at 00:39 UTC (#1205069)


in reply to Re^2: XML::Rules parsing inside out?
in thread XML::Rules parsing inside out?

Hi

So, um, why not use those rules? Then work with the resulting hash?

*tip* ?node_id=3989;BIT=XML%3A%3ARules-%3Enew;HIT=xml ... Re: XML::LibXML drives me to drinking

This might be something like what you were attempting

#!/usr/bin/perl --
use strict;
use warnings;
use XML::Rules;
use Data::Dump qw/ dd /;

my $rawxml = q{<?xml version="1.0" encoding="UTF-8"?>
<root>
<summary>
  <item>
    <value>1.0</value>
  </item>
</summary>
<detail1>
  <item>
    <value>2.0</value>
  </item>
</detail1>
<detail2>
  <item>
    <value>3.0</value>
  </item>
</detail2>
<value> 11 </value>
</root>
};

dd( XML::Rules->new( rules => [], )->parse( $rawxml ) );   # no rules given: full dump
dd( XML::Rules->inferRulesFromExample( $rawxml ) );        # the rules the module suggests
dd( XML::Rules->new(
    rules => XML::Rules->inferRulesFromExample( $rawxml ),
)->parse( $rawxml ) );                                     # parse with the suggested rules

my ( $summary, $detail1, $detail2 );
my $xr = XML::Rules->new(
    qw/ stripspaces 8 /,   # trim whitespace around tag content
    rules => {
        'detail1,detail2,item,root,summary' => sub { return; },   # keep nothing from containers
        'value' => [       # path-specific rules, tried in order
            '/root/summary/item' => sub {
                ( $summary, $detail1, $detail2 ) = (); #reset
                $summary = $_[1]->{_content};
                return;
            },
            '/root/detail1/item' => sub {
                $detail1 = $_[1]->{_content};
                return;
            },
            '/root/detail2/item' => sub {
                $detail2 = $_[1]->{_content};
                warn "$summary $detail1 $detail2\n";
                return;
            },
            sub { die "unexpected 'value' at ".join('/','',@{$_[2]}) },   # any other <value>
        ],
    },
);
my $ret = $xr->parse( $rawxml );
dd( $ret );
__END__
$ perl xml-rules-1205065.pl
{
  root => {
    _content => "\n\n\n\n\n",
    detail1 => {
      _content => "\n \n",
      item => { _content => "\n \n ", value => { _content => "2.0" } },
    },
    detail2 => {
      _content => "\n \n",
      item => { _content => "\n \n ", value => { _content => "3.0" } },
    },
    summary => {
      _content => "\n \n",
      item => { _content => "\n \n ", value => { _content => "1.0" } },
    },
    value => { _content => " 11 " },
  },
}
{
  "detail1,detail2,item,root,summary" => "no content",
  "value" => "content",
}
{
  root => {
    detail1 => { item => { value => "2.0" } },
    detail2 => { item => { value => "3.0" } },
    summary => { item => { value => "1.0" } },
    value => " 11 ",
  },
}
1.0 2.0 3.0
unexpected 'value' at /root at xml-rules-1205065.pl line 54.

Replies are listed 'Best First'.
Re^4: XML::Rules parsing inside out?
by bfdi533 (Friar) on Dec 07, 2017 at 22:15 UTC

    I suppose the real reason I do not want to use the rules it provides is that it is then no better than XML::Simple.

    The "real" XML is much more complicated, which means I have to reference items 6 or 7 levels deep, with some labels as long as 38 characters. So it would be something like $hash->{'SomeVeryLongCollectionName'}->{'AnotherLowerLevelOfItems'}->{'Summary'}->{'Collections'}->{'Collection'}->{'Item'}->{'Value'} which is VERY unattractive.
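    For comparison, if one did stay with the generated hash, the long path would only need to be spelled out once. A minimal sketch, assuming made-up data shaped like the path quoted above; the variable names `$hash` and `$collection` and the value '1.0' are placeholders, not the real document:

```perl
use strict;
use warnings;

# Hypothetical data shaped like the path quoted above.
my $hash = {
    SomeVeryLongCollectionName => {
        AnotherLowerLevelOfItems => {
            Summary => {
                Collections => {
                    Collection => { Item => { Value => '1.0' } },
                },
            },
        },
    },
};

# Spell the long path once and keep a reference to the deep node.
my $collection = $hash->{SomeVeryLongCollectionName}{AnotherLowerLevelOfItems}
    ->{Summary}{Collections}{Collection};

# Later code only needs the short tail of the path.
my $value = $collection->{Item}{Value};
print "$value\n";
```

    This does not remove the nesting, it only hides it behind one assignment, so it may or may not be enough for the real data.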

    The use of XML::Rules allows me, if done properly, to build my own hash and not have to deal with all of those levels and structure, which are just unwieldy.

    And using the rules as specified, and as you showed in your example, I still have to deal with the full, very long path in order to figure out which item I am dealing with, but maybe it gets me closer. I will try some variations of your code and see where it gets me ...
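    One such variation, as a rough sketch rather than a tested answer for the real document: let the 'value' rule look at the names of its enclosing tags (the third argument to the rule, as used in the example above) and store the content in a flat hash of your own. The hash name `%flat` and the shortened sample XML are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Rules;

# Shortened, hypothetical sample modelled on the XML earlier in the thread.
my $xml = <<'XML';
<root>
  <summary><item><value>1.0</value></item></summary>
  <detail1><item><value>2.0</value></item></detail1>
  <detail2><item><value>3.0</value></item></detail2>
</root>
XML

my %flat;    # our own structure: section name => value

my $parser = XML::Rules->new(
    stripspaces => 8,
    rules       => {
        # Containers contribute nothing to the result.
        'detail1,detail2,item,root,summary' => sub { return; },

        # $context holds the names of the enclosing tags, outermost first,
        # e.g. [ 'root', 'summary', 'item' ] for the first <value>.
        'value' => sub {
            my ( $tag, $attr, $context ) = @_;
            my $section = $context->[-2];    # 'summary', 'detail1', ...
            $flat{$section} = $attr->{_content};
            return;
        },
    },
);
$parser->parse($xml);

print "$_ => $flat{$_}\n" for sort keys %flat;
```

    The rule only ever looks two levels up, so the long path never has to be written out; a <value> sitting at some other depth would need its own handling.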

      The results of xml2XMLRules or dtd2XMLRules are just a starting point :-)

      Sometimes they are enough. Sometimes you decide to change them to ignore some tags, to take just the content and ignore the attributes, or to use an attribute as the hash key. And sometimes you add custom rules that let you filter or massage the data further, or handle the twigs of the XML.
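      A small sketch of what such tweaks might look like. The sample XML and its tag names are invented for illustration; 'content' and 'no content' are the rule names seen in the inferred rules above, while 'by id' relies on the 'by <attrname>' built-in as I understand it from the XML::Rules documentation, so treat that part as an assumption to check.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Rules;

# Hypothetical sample: items identified by an id attribute.
my $xml = <<'XML';
<collection>
  <item id="summary"><value>1.0</value><note>skip me</note></item>
  <item id="detail1"><value>2.0</value><note>skip me too</note></item>
</collection>
XML

my $parser = XML::Rules->new(
    rules => {
        value      => 'content',           # keep just the text of the tag
        note       => sub { return; },     # ignore this tag entirely
        item       => 'by id',             # use the id attribute as the hash key
        collection => 'no content',        # keep the children, drop the whitespace text
    },
);
my $data = $parser->parse($xml);

for my $id ( sort keys %{ $data->{collection} } ) {
    print "$id => $data->{collection}{$id}{value}\n";
}
```

      Each line of the rules hash is one of the changes listed above, applied on top of what the generated rules would have given.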

      Especially if the XML is much more complicated, it's better to start with one of them (preferably the latter, if you do have a DTD) and then tweak the rules instead of starting from a clean slate.

      This is also why I provide those two as executables, not primarily as methods to call right before the parsing.

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.
