PerlMonks  

XML::Simple and <tag> </tag>

by Jenda (Abbot)
on May 18, 2004 at 17:50 UTC ( #354371=perlquestion )
Jenda has asked for the wisdom of the Perl Monks concerning the following question:

Is there any way to force XML::Simple to keep the spaces?

use strict;
use XML::Simple;
use Data::Dumper;

my $Parser = new XML::Simple(
    forcearray    => [qw(LOCALE)],
    suppressempty => '',
);
my $data = $Parser->XMLin(<<'*END*');
<tag>
 <subtag> </subtag>
 <othertag> ahoj </othertag>
</tag>
*END*
print Dumper($data);
prints
$VAR1 = {
          'othertag' => ' ahoj ',
          'subtag' => ''
        };
I need $data->{subtag} eq ' ', not ''.

I tried to encode the space as &#32;, but that did not make any difference. I also tried all other values of the suppressempty option, but that did not help. P.S.: It's quite possible there will be several spaces in the tag, I need them all!

Update: I patched XML::Simple and added a new value for the suppressempty option that does what I need.

Update: Actually, the patch is incomplete. I need to get empty strings, not undefs, for the completely empty tags, so I've added another value for the option: ' ' = keep the spaces and return '' for empty tags, vs. '0' = keep the spaces and return undef for empty tags. I'll post the updated patch later. If anyone has a better idea what values to use, I'm all ears :-)

Jenda
Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
   -- Rick Osborne

Re: XML::Simple and <tag> </tag>
by Aragorn (Curate) on May 18, 2004 at 18:26 UTC
    From the documentation:
    WHERE TO FROM HERE?
    XML::Simple is able to present a simple API because it makes some assumptions on your behalf. These include:
      o You're not interested in text content consisting only of whitespace
      o ...
    So it seems you need some more sophisticated parser, like XML::Parser or XML::LibXML.
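
For illustration, here is a minimal sketch of the XML::LibXML route, assuming the module is installed and the parser is left at its default settings (blank text nodes are kept unless you ask for them to be dropped). The %data hash and the flat one-level layout are simplifications for this example, not something from the original post.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'END_XML';
<tag>
  <subtag>   </subtag>
  <othertag> ahoj </othertag>
</tag>
END_XML

# Parse the string into a DOM tree.
my $doc  = XML::LibXML->load_xml(string => $xml);
my $root = $doc->documentElement;

# Collect the text of each child element, whitespace included.
my %data;
for my $child ($root->findnodes('*')) {
    $data{ $child->nodeName } = $child->textContent;
}

# Brackets make the preserved spaces visible.
printf "%s => [%s]\n", $_, $data{$_} for sort keys %data;
```

The price is that you build the data structure yourself instead of getting a ready-made hash from XMLin.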

    Arjen

      XML::Simple is not a parser! XML::Simple uses XML::SAX or XML::Parser if they are installed, and in this order of preference.

      The question is that by default the content of a node that has only spaces is just ignored, or we will always have content if the XML tree is indented.

      Graciliano M. P.
      "Creativity is the expression of the liberty".

        XML::Simple is not a parser! XML::Simple uses XML::SAX or XML::Parser if they are installed, and in this order of preference.
        Indeed. My bad.
        The question is that by default the content of a node that has only spaces is just ignored...
        Using the SuppressEmpty option, you can specify how an element containing only whitespace is ignored, not whether it is ignored.
        or we will always have content if the XML tree is indented.
        I don't understand this part of your reply.

        Referring to the question of the OP: "Is there any way to force XML::Simple to keep the spaces?". The answer is "No, there isn't". Elements consisting only of whitespace are ignored, and the only thing you can control is whether they appear in the resulting hash as an empty string, as undef, or not at all.
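
A small sketch of those three outcomes, assuming a stock (unpatched) XML::Simple and the SuppressEmpty values as its documentation describes them. The variable names are made up for the example.

```perl
use strict;
use warnings;
use XML::Simple;

my $xml = '<tag><subtag> </subtag><othertag> ahoj </othertag></tag>';

# Same document, three SuppressEmpty settings.
my $as_empty   = XMLin($xml, SuppressEmpty => '');     # empty string
my $as_undef   = XMLin($xml, SuppressEmpty => undef);  # undef
my $as_missing = XMLin($xml, SuppressEmpty => 1);      # key left out

for my $case ([ "''"  => $as_empty   ],
              [ undef => $as_undef   ],
              [ 1     => $as_missing ]) {
    my ($label, $data) = @$case;
    my $state = !exists $data->{subtag}  ? 'not present'
              : !defined $data->{subtag} ? 'undef'
              :                            "string of length " . length $data->{subtag};
    printf "SuppressEmpty => %-5s subtag is %s\n", $label // 'undef', $state;
}
```

In none of the three cases does the original space come back.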

        Arjen

Re: XML::Simple and <tag> </tag>
by Ryszard (Priest) on May 19, 2004 at 10:25 UTC
    If you have control over the source XML document/stream, you could use some kind of placeholder substitution and a regex.

    If you put a placeholder in place of the spaces, you can quite easily restore them after parsing with s/\~placeholder/ /g.
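
A rough sketch of that idea, under some assumptions: protect_spaces and restore_spaces are hypothetical helper names, the marker must never occur in your real data, and only content made of plain space characters (no tabs or newlines) is handled. The hash below merely stands in for what XMLin would be expected to return for the protected document.

```perl
use strict;
use warnings;

# Hypothetical marker; pick any string that cannot occur in the real data.
my $PLACEHOLDER = '~placeholder';

# Before parsing: in elements whose content is spaces only, replace each
# space with the marker so the parser sees non-whitespace text.
sub protect_spaces {
    my ($xml) = @_;
    $xml =~ s{(<([\w:.-]+)(?:\s[^<>]*)?>)( +)(</\2>)}
             { $1 . ($PLACEHOLDER x length($3)) . $4 }ge;
    return $xml;
}

# After parsing: walk the structure in place and turn markers back into spaces.
sub restore_spaces {
    my ($node) = @_;
    my @values = ref $node eq 'HASH'  ? \(values %$node)
               : ref $node eq 'ARRAY' ? \(@$node)
               :                        ();
    for my $ref (@values) {
        if    (ref $$ref)     { restore_spaces($$ref) }
        elsif (defined $$ref) { $$ref =~ s/\Q$PLACEHOLDER\E/ /g }
    }
    return $node;
}

my $xml       = '<tag><subtag>  </subtag><othertag> ahoj </othertag></tag>';
my $protected = protect_spaces($xml);

# Stand-in for the parsed result of $protected (an assumption, not real parser output).
my $data = { subtag => $PLACEHOLDER x 2, othertag => ' ahoj ' };
restore_spaces($data);
printf "[%s]\n", $data->{subtag};
```

One marker per space keeps the count intact, which matters if the tag may contain several spaces.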
