http://www.perlmonks.org?node_id=1012392

nagalenoj has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm using XML::Simple to parse some XML data, but the structure of its output changes depending on the input. I'd like it to produce the same style of output every time, but I couldn't get it to. Here is my sample program.

use strict;
use warnings;
use Data::Dumper;
use XML::Simple;

my $res = "<results>
  <binding name=\'description\'>
    <literal xml:lang=\'en\'>First article. </literal>
  </binding>
</results>";

my $res2 = "<results>
  <binding name=\'image_url\'>
    <uri>Football.png</uri>
  </binding>
  <binding name=\'description\'>
    <literal xml:lang=\'en\'>Second article.</literal>
  </binding>
</results>";

my $result  = XMLin($res);
my $result2 = XMLin($res2);

print Dumper $result;
print Dumper $result2;

The output it produces is:

$VAR1 = {
  'binding' => {
    'literal' => {
      'content' => 'First article. ',
      'xml:lang' => 'en'
    },
    'name' => 'description'
  }
};
$VAR1 = {
  'binding' => {
    'image_url' => {
      'uri' => 'Football.png'
    },
    'description' => {
      'literal' => {
        'content' => 'Second article.',
        'xml:lang' => 'en'
      }
    }
  }
};

Look at the description binding: the two outputs are structured differently. The structure it produces for 'Second article.' is the one I want. Is there any way to make it produce the same structure in both cases?

Thanks.

Replies are listed 'Best First'.
Re: XML::Simple - Handling inconsistency
by tobyink (Canon) on Jan 09, 2013 at 10:27 UTC

    You appear to be trying to parse SPARQL result sets. There are some really good RDF/SPARQL modules for Perl, so you needn't be mucking around with XML stuff!

    use strict;
    use warnings;
    use RDF::Query::Client;

    my $query = RDF::Query::Client->new(<<'SPARQL');
    PREFIX category: <http://dbpedia.org/resource/Category:>
    PREFIX dc: <http://purl.org/dc/terms/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT *
    WHERE {
      ?resource dc:subject category:English_film_actors .
      ?resource dc:subject category:Life_peers .
      ?resource rdfs:label ?name .
      FILTER ( langMatches(lang(?name), "en") )
    }
    SPARQL

    my $results = $query->execute('http://dbpedia.org/sparql');

    while (my $row = $results->next) {
        printf(
            "%s <%s>\n",
            $row->{name}->literal_value,
            $row->{resource}->uri,
        );
    }
    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: XML::Simple - Handling inconsistency
by Anonymous Monk on Jan 09, 2013 at 09:00 UTC

    Is there any way to make it give the similar structure?

    Use XML::Rules. Just give xml2XMLRules the most complex, detailed XML file you have, and you'll get rules and data like these:

    #!/usr/bin/perl --
    use strict;
    use warnings;
    use Data::Dump;
    use XML::Rules;

    my $ta = XML::Rules->new(
        qw/ stripspaces 8 /,
        rules => {
            'literal' => 'as is',
            'binding' => 'as array no content',
            'uri'     => 'content',
            'results' => 'no content'
        }
    );

    my @xml = (
        q{<results>
      <binding name="image_url">
        <uri>Football.png</uri>
      </binding>
      <binding name="description">
        <literal xml:lang="en">Second article.</literal>
      </binding>
    </results>
    },
        q{<results>
      <binding name="description">
        <literal xml:lang="en">First article. </literal>
      </binding>
    </results>
    }
    );

    dd( my $ref = $ta->parsefile( \$_ ) ) for @xml;
    __END__
    {
      results => {
        binding => [
          { name => "image_url", uri => "Football.png" },
          {
            literal => { "_content" => "Second article.", "xml:lang" => "en" },
            name => "description",
          },
        ],
      },
    }
    {
      results => {
        binding => [
          {
            literal => { "_content" => "First article.", "xml:lang" => "en" },
            name => "description",
          },
        ],
      },
    }

    Sure, you could get there with XML::Simple, but I don't bother anymore

    Data::Diver for navigation :)

      It's possible and advised to give xml2XMLRules not one but several xml files. That way you are more likely to catch all optional tag attributes and repeatable tags. If you have a DTD it's better to use dtd2XMLRules.

      <note>xml2XMLRules and dtd2XMLRules are scripts installed with the XML::Rules module. All they do is call the XML::Rules::inferRulesFromExample() and XML::Rules::inferRulesFromDTD() subroutines and print the inferred rules. The inferred rules instruct XML::Rules to build a minimal consistent data structure out of your XML (only tags that may be repeated are turned into arrays, only tags that may have attributes are turned into hashes, etc.).</note>

      There's a short writeup on this in Simpler than XML::Simple.

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.

Re: XML::Simple - Handling inconsistency
by vinoth.ree (Monsignor) on Jan 09, 2013 at 07:56 UTC

    Does this help?

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;
    use XML::Simple;

    my $res = "<results>
      <binding name=\'image_url\'>
        <uri>Football.png</uri>
      </binding>
      <binding name=\'description\'>
        <literal xml:lang=\'en\'>First article. </literal>
      </binding>
    </results>";

    my $res2 = "<results>
      <binding name=\'description\'>
        <literal xml:lang=\'en\'>Second article.</literal>
      </binding>
    </results>";

    my $xml_simple = XML::Simple->new( KeepRoot => 1, KeyAttr => 1, ForceArray => 1 );

    my $result  = $xml_simple->XMLin($res);
    my $result2 = $xml_simple->XMLin($res2);

    print Dumper $result;
    print Dumper $result2;
    Update:

    Adding the output,

    $VAR1 = {
      'results' => [
        {
          'binding' => [
            {
              'name' => 'image_url',
              'uri' => [ 'Football.png' ]
            },
            {
              'literal' => [
                {
                  'content' => 'First article. ',
                  'xml:lang' => 'en'
                }
              ],
              'name' => 'description'
            }
          ]
        }
      ]
    };
    $VAR1 = {
      'results' => [
        {
          'binding' => [
            {
              'literal' => [
                {
                  'content' => 'Second article.',
                  'xml:lang' => 'en'
                }
              ],
              'name' => 'description'
            }
          ]
        }
      ]
    };
Re: XML::Simple - Handling inconsistency
by nagalenoj (Friar) on Jan 09, 2013 at 09:57 UTC
    Thanks for the replies. I got it working with the options below.
    my $result = XMLin($res, KeyAttr => {binding => 'name'}, ForceArray => [ 'binding' ] );
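For anyone landing here later, here is a minimal sketch of why those two options give a consistent structure. XML::Simple only folds elements by a key attribute when they are in a list, so a lone binding is left unfolded by default. ForceArray => ['binding'] makes binding a list even when there is only one, and KeyAttr => { binding => 'name' } then folds that list into a hash keyed by the name attribute. The helper name description_text and the variables $one_binding and $two_bindings are made up for this illustration; the XML is trimmed from the samples in the question.

```perl
use strict;
use warnings;
use XML::Simple;

# Options from the post above: always treat <binding> as a list,
# then fold that list into a hash keyed by its 'name' attribute.
my %opts = (
    KeyAttr    => { binding => 'name' },
    ForceArray => ['binding'],
);

# Sample inputs (hypothetical, shortened from the question).
my $one_binding = q{<results>
  <binding name="description">
    <literal xml:lang="en">First article.</literal>
  </binding>
</results>};

my $two_bindings = q{<results>
  <binding name="image_url">
    <uri>Football.png</uri>
  </binding>
  <binding name="description">
    <literal xml:lang="en">Second article.</literal>
  </binding>
</results>};

# Illustrative helper: the same access path is used no matter
# how many <binding> elements the document contains.
sub description_text {
    my ($xml) = @_;
    my $data = XMLin($xml, %opts);
    return $data->{binding}{description}{literal}{content};
}

print description_text($_), "\n" for $one_binding, $two_bindings;
```

With the default options the first document would need $data->{binding}{literal}{content} instead, which is the inconsistency described in the question.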