XML::Simple - Handling inconsistency

by nagalenoj (Friar)
on Jan 09, 2013 at 07:00 UTC
nagalenoj has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm using XML::Simple to parse some XML data, but the structure it returns is inconsistent from one input to the next. I'd like it to produce the same style of output every time, but I haven't managed to get that. Here is my sample program.

    use strict;
    use warnings;
    use Data::Dumper;
    use XML::Simple;

    my $res = "<results>
      <binding name=\'description\'>
        <literal xml:lang=\'en\'>First article. </literal>
      </binding>
    </results>";

    my $res2 = "<results>
      <binding name=\'image_url\'>
        <uri>Football.png</uri>
      </binding>
      <binding name=\'description\'>
        <literal xml:lang=\'en\'>Second article.</literal>
      </binding>
    </results>";

    my $result  = XMLin($res);
    my $result2 = XMLin($res2);

    print Dumper $result;
    print Dumper $result2;

The output it produces is:

    $VAR1 = {
      'binding' => {
        'literal' => {
          'content' => 'First article. ',
          'xml:lang' => 'en'
        },
        'name' => 'description'
      }
    };
    $VAR1 = {
      'binding' => {
        'image_url' => {
          'uri' => 'Football.png'
        },
        'description' => {
          'literal' => {
            'content' => 'Second article.',
            'xml:lang' => 'en'
          }
        }
      }
    };

If you look at the description binding, you can see the difference between the two outputs. The structure produced for 'Second article.' in the example is the one I want. Is there any way to make it return the same structure in both cases?

Thanks.

Re: XML::Simple - Handling inconsistency
by tobyink (Abbot) on Jan 09, 2013 at 10:27 UTC

    You appear to be trying to parse SPARQL result sets. There are some really good RDF/SPARQL modules for Perl, so you needn't be mucking around with XML stuff!

        use strict;
        use warnings;
        use RDF::Query::Client;

        my $query = RDF::Query::Client->new(<<'SPARQL');
        PREFIX category: <http://dbpedia.org/resource/Category:>
        PREFIX dc: <http://purl.org/dc/terms/>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT *
        WHERE {
            ?resource dc:subject category:English_film_actors .
            ?resource dc:subject category:Life_peers .
            ?resource rdfs:label ?name .
            FILTER ( langMatches(lang(?name), "en") )
        }
        SPARQL

        my $results = $query->execute('http://dbpedia.org/sparql');
        while (my $row = $results->next) {
            printf(
                "%s <%s>\n",
                $row->{name}->literal_value,
                $row->{resource}->uri,
            );
        }
Re: XML::Simple - Handling inconsistency
by Anonymous Monk on Jan 09, 2013 at 09:00 UTC

    Is there any way to make it give the similar structure?

    Use XML::Rules. Just give xml2XMLRules the most complex, detailed XML file you have, and you'll get rules and data like these:

        #!/usr/bin/perl --
        use strict;
        use warnings;
        use Data::Dump;
        use XML::Rules;

        my $ta = XML::Rules->new(
            qw/ stripspaces 8 /,
            rules => {
                'literal' => 'as is',
                'binding' => 'as array no content',
                'uri'     => 'content',
                'results' => 'no content'
            }
        );

        my @xml = (
            q{<results>
                <binding name="image_url">
                  <uri>Football.png</uri>
                </binding>
                <binding name="description">
                  <literal xml:lang="en">Second article.</literal>
                </binding>
              </results>
            },
            q{<results>
                <binding name="description">
                  <literal xml:lang="en">First article. </literal>
                </binding>
              </results>
            }
        );

        dd( my $ref = $ta->parsefile( \$_ ) ) for @xml;

        __END__
        {
          results => {
            binding => [
              { name => "image_url", uri => "Football.png" },
              {
                literal => { "_content" => "Second article.", "xml:lang" => "en" },
                name => "description",
              },
            ],
          },
        }
        {
          results => {
            binding => [
              {
                literal => { "_content" => "First article.", "xml:lang" => "en" },
                name => "description",
              },
            ],
          },
        }

    Sure, you could get there with XML::Simple, but I don't bother anymore.

    Data::Diver for navigation :)

      It's possible, and advisable, to give xml2XMLRules not one but several XML files. That way you are more likely to catch all the optional tag attributes and repeatable tags. If you have a DTD, it's better to use dtd2XMLRules.

      <note>xml2XMLRules and dtd2XMLRules are scripts installed with the XML::Rules module. All they do is call the XML::Rules::inferRulesFromExample() and XML::Rules::inferRulesFromDTD() subroutines and print the inferred rules. The inferred rules instruct XML::Rules to build a minimal consistent data structure out of your XML: only tags that may be repeated are turned into arrays, only tags that may have attributes are turned into hashes, and so on.</note>

      There's a short writeup on this in Simpler than XML::Simple.

      Jenda

Re: XML::Simple - Handling inconsistency
by vinoth.ree (Prior) on Jan 09, 2013 at 07:56 UTC

    Is this helpful for you?

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Data::Dumper;
        use XML::Simple;

        my $res = "<results>
          <binding name=\'image_url\'>
            <uri>Football.png</uri>
          </binding>
          <binding name=\'description\'>
            <literal xml:lang=\'en\'>First article. </literal>
          </binding>
        </results>";

        my $res2 = "<results>
          <binding name=\'description\'>
            <literal xml:lang=\'en\'>Second article.</literal>
          </binding>
        </results>";

        # ForceArray => 1 turns every nested element into an array.
        # KeyAttr => 1 is read as an attribute named '1', which never matches,
        # so nothing is folded into a hash (see the output below).
        my $xml_simple = XML::Simple->new( KeepRoot => 1, KeyAttr => 1, ForceArray => 1 );

        my $result  = $xml_simple->XMLin($res);
        my $result2 = $xml_simple->XMLin($res2);

        print Dumper $result;
        print Dumper $result2;
    Update: adding the output:

        $VAR1 = {
          'results' => [
            {
              'binding' => [
                {
                  'name' => 'image_url',
                  'uri' => [ 'Football.png' ]
                },
                {
                  'literal' => [
                    {
                      'content' => 'First article. ',
                      'xml:lang' => 'en'
                    }
                  ],
                  'name' => 'description'
                }
              ]
            }
          ]
        };
        $VAR1 = {
          'results' => [
            {
              'binding' => [
                {
                  'literal' => [
                    {
                      'content' => 'Second article.',
                      'xml:lang' => 'en'
                    }
                  ],
                  'name' => 'description'
                }
              ]
            }
          ]
        };
Re: XML::Simple - Handling inconsistency
by nagalenoj (Friar) on Jan 09, 2013 at 09:57 UTC
    Thanks for the replies. I got it working with the options below.

        my $result = XMLin($res, KeyAttr => {binding => 'name'}, ForceArray => [ 'binding' ] );
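
For anyone who lands here later, the following is a minimal sketch of why those two options help, assuming XML::Simple's documented defaults. By default XML::Simple folds a list of elements into a hash keyed on a name, key or id attribute, but only when the element actually repeats; a lone binding element is left as a plain hash, which is where the two shapes in the question come from. Forcing binding to always be an array, and folding that array on name, should give both inputs the same shape. The variable names and the bindings_for() helper are made up for illustration, and the trailing space after "First article." has been dropped.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Sample documents modelled on the ones in the question.
my $one_binding = <<'XML';
<results>
  <binding name="description">
    <literal xml:lang="en">First article.</literal>
  </binding>
</results>
XML

my $two_bindings = <<'XML';
<results>
  <binding name="image_url">
    <uri>Football.png</uri>
  </binding>
  <binding name="description">
    <literal xml:lang="en">Second article.</literal>
  </binding>
</results>
XML

# The options from the reply above:
#   ForceArray => ['binding']  treats <binding> as a list even when it occurs once
#   KeyAttr => {binding => 'name'}  folds that list into a hash keyed on "name"
my %opts = (
    KeyAttr    => { binding => 'name' },
    ForceArray => ['binding'],
);

# Hypothetical helper: parse one document and return its folded bindings.
sub bindings_for {
    my ($xml) = @_;
    return XMLin($xml, %opts)->{binding};
}

my $first  = bindings_for($one_binding);
my $second = bindings_for($two_bindings);

# The same access path now works for both documents.
print $first->{description}{literal}{content},  "\n";
print $second->{description}{literal}{content}, "\n";
print $second->{image_url}{uri},                "\n";
```

Listing only binding in ForceArray keeps uri and literal as plain values, so the rest of the structure stays as compact as in the original output.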
