PerlMonks  

XMLin question

by stryda42 (Novice)
on Feb 15, 2013 at 17:49 UTC

stryda42 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

Just being brief here.

When using XMLin to parse my XML file, I received the error 'parser error : xmlParseStartTag: invalid element name',

which refers to the less-than sign: <. I have the sign there on purpose. Here is some made-up XML content:

<sample> <begin> You must be < 18. </begin> </sample>

Is there a way to get around this error?

I see XMLout has NoEscape => 1, so why doesn't XMLin have something similar?

Thanks,
Stryda

Replies are listed 'Best First'.
Re: XMLin question
by Anonymous Monk on Feb 15, 2013 at 17:51 UTC

    Is there a way to get around this error?

    Yes: fix your XML so that it actually is XML. As written it doesn't qualify as XML; it is broken. Any < > in content needs to be escaped, but you knew that already; it's a FAQ, after all.
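If you control the code that writes the file, the escaping can be done before the text is placed between the tags. The following is a minimal sketch using only core Perl; `escape_xml_text` is a hypothetical helper name, not something provided by XML::Simple.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Simple): escape the characters
# that are significant as markup inside XML text content.
sub escape_xml_text {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first, or the entities added below get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;     # only strictly required in "]]>", escaped here for symmetry
    return $text;
}

my $content = 'You must be < 18.';
my $xml = '<sample><begin>' . escape_xml_text($content) . '</begin></sample>';
print "$xml\n";    # prints: <sample><begin>You must be &lt; 18.</begin></sample>
```

An XML parser decodes `&lt;` back to a literal `<` when it reads the element, so the data seen after XMLin should contain the original character.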

      Ok, thank you

      I was curious if there was some way to auto-convert < to &lt; as well as tags like this in general instead of manually doing this, from an end user perspective.

        I was curious if there was some way to auto-convert < to &lt; as well as tags like this in general instead of manually doing this, from an end user perspective.

        :| You don't say :/

        perl -pe "%f = qw{ < &lt; > &gt; }; s{[^><]\K([><])([^\w])}{ $f{$1}$2}g; " < in > out
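As far as I can tell, the pattern in the one-liner above also matches the closing `>` of a tag when a non-word character such as a space follows it, so it can damage otherwise valid markup. A more conservative heuristic is sketched below. It is an assumption-laden guess at what counts as "stray", not a parser: it treats `<` as markup only when it is followed by a name-start character, `/`, `!` or `?`, and treats `&` as markup only when it begins something shaped like an entity or character reference. `escape_stray_markup` is a hypothetical name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: escape '<' and '&' only where they cannot start markup.
# This is a heuristic; text such as "x <y" would still be mistaken for a tag.
sub escape_stray_markup {
    my ($xml) = @_;
    # '&' not followed by a named entity, decimal reference, or hex reference
    $xml =~ s/&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    # '<' not followed by a name-start character, '/', '!' or '?'
    $xml =~ s/<(?![A-Za-z_:\/!?])/&lt;/g;
    return $xml;
}

my $broken = '<sample> <begin> You must be < 18. </begin> </sample>';
print escape_stray_markup($broken), "\n";
# prints: <sample> <begin> You must be &lt; 18. </begin> </sample>
```

The `>` character is left alone here because it is legal in text content; only `<` and `&` make the parser fail.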
Re: XMLin question
by ww (Archbishop) on Feb 15, 2013 at 17:52 UTC

    A1: Use an HTML entity, &lt;, perhaps (not sure that's valid), or Unicode, U+003C (60).
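For what it's worth, `&lt;` is one of the five entities predefined by XML itself (`&lt;`, `&gt;`, `&amp;`, `&apos;`, `&quot;`), so it is valid without any DTD. The code point can also be written as a numeric character reference. A small sketch of building those references in core Perl follows; `char_refs` is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the decimal and hexadecimal character
# references for a single character.
sub char_refs {
    my ($char) = @_;
    my $code = ord $char;
    return ( sprintf('&#%d;', $code), sprintf('&#x%X;', $code) );
}

my ($dec, $hex) = char_refs('<');
print "$dec $hex\n";    # prints: &#60; &#x3C;
```

Either form, or `&lt;`, can stand in for the literal `<` in the sample content.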

    A2: Well beyond the scope of this reply.... and probably a question asked only in frustration. If you really want an answer, suggest p5p.


    If you didn't program your executable by toggling in binary, it wasn't really programming!

      A2: Well beyond the scope of this reply.... and probably a question asked only in frustration. If you really want an answer, suggest p5p.

      How is it a question for p5p? I think it is not

        Because those active on p5p are among the possible sources of an answer to an extremely esoteric question; an answer that would probably otherwise require the OP to have the experience, knowledge and patience to profit by digging through perlguts and who-knows-what-else.

        On the other hand, if you, Anonymonk, can answer it here, I'll be delighted to upvote a brief and correct exposition that's accessible to those of less than Perl Porter stature (like /me).


        If you didn't program your executable by toggling in binary, it wasn't really programming!

Re: XMLin question (xmlfixup.pl)
by Anonymous Monk on Feb 15, 2013 at 19:44 UTC
    #!/usr/bin/perl --
    use strict;
    use warnings;
    use HTML::Encoding 'encoding_from_http_message';
    use WWW::Mechanize;
    use Encode;
    use HTML::Tree;

    my $file = shift or die "
    Usage:
        xmlfixup.pl file:in.xml > out.xml
        xmlfixup.pl http://example.com/foo.xml > out.utf8.xml
    ";

    my $resp = WWW::Mechanize->new( autocheck => 1 )->get( $file );
    my $enco = encoding_from_http_message( $resp );
    my $utf8;
    if( $enco ) {
        $utf8 = decode( $enco => $resp->content );
    } else {
        $utf8 = $resp->content;
    }

    my $t = HTML::TreeBuilder->new(
        qw(
            ignore_unknown              0
            no_space_compacting         1
            ignore_ignorable_whitespace 0
            implicit_tags               0
            no_expand_entities          1
            store_comments              1
            store_pis                   1
        )
    );
    #~ $t->xml_mode( 1 );
    $t->parse_content( $utf8 );
    binmode STDOUT, ':utf8';
    print $_->as_XML for $t->content_list;
    __END__
      Fails when data contains <![CDATA[ ... ]]>

      I would like to use this, with a fix I have written for CDATA and a couple of other things, on XML::Smart.

      Please /msg me or reply to this so I can assign credit.

        by Anonymous Monk http://perlmonks.org/?node_id=1018953
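The CDATA fix mentioned above is not shown in the thread, so the following is only a guess at one way to approach the problem with a regex-based cleanup rather than with HTML::TreeBuilder. The idea is to split the input on CDATA sections, leave those untouched (a `<` inside CDATA is already literal), and escape stray `<` characters only in the remaining text. `escape_outside_cdata` is a hypothetical name, and the "stray `<`" test is the same kind of heuristic as any regex approach.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: escape stray '<' everywhere except inside CDATA sections.
sub escape_outside_cdata {
    my ($xml) = @_;
    # The capturing group makes split() return the CDATA sections as list items too.
    my @parts = split /(<!\[CDATA\[.*?\]\]>)/s, $xml;
    for my $part (@parts) {
        next if $part =~ /\A<!\[CDATA\[/;        # CDATA content is already literal
        $part =~ s/<(?![A-Za-z_:\/!?])/&lt;/g;   # '<' that cannot start markup
    }
    return join '', @parts;
}

my $in = '<a><![CDATA[ 1 < 2 ]]> 3 < 4 </a>';
print escape_outside_cdata($in), "\n";
# prints: <a><![CDATA[ 1 < 2 ]]> 3 &lt; 4 </a>
```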

Node Type: perlquestion [id://1018925]
Front-paged by Arunbear