Re: XMLin question
by Anonymous Monk on Feb 15, 2013 at 17:51 UTC
|
Is there a way to get around this error?
Yes: fix your XML so that it actually is XML. As written it isn't well-formed, so it doesn't qualify as XML. Any < or > in content needs to be escaped (as &lt; and &gt;). But you knew that already; it's a FAQ, after all.
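To make the escaping advice concrete, here is a minimal sketch in core Perl. The helper name escape_xml_text and the sample <cond> element are made up for illustration and are not from the original question. It only helps if you control the code that writes the XML; it won't repair a file that is already broken.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): escape the characters that
# are special in XML character data. '&' is handled first so that the
# entities produced for '<' and '>' are not escaped a second time.
sub escape_xml_text {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# prints: <cond>a &lt; b &amp;&amp; c &gt; d</cond>
print '<cond>', escape_xml_text('a < b && c > d'), "</cond>\n";
```

Escape only the text content, never the markup around it.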
| [reply] |
|
| [reply] |
|
| [reply] |
|
perl -pe "%f = qw{ < &lt; > &gt; }; s{[^><]\K([><])([^\w])}{ $f{$1}$2}g;" < in > out
| [reply] [d/l] |
|
|
|
Re: XMLin question
by ww (Archbishop) on Feb 15, 2013 at 17:52 UTC
|
A1: Use an HTML entity - &lt; - perhaps (not sure that's valid) or Unicode, U+003C (60).
A2: Well beyond the scope of this reply.... and probably a question asked only in frustration. If you really want an answer, suggest p5p.
If you didn't program your executable by toggling in binary, it wasn't really programming!
| [reply] [d/l] [select] |
|
"A2: Well beyond the scope of this reply.... and probably a question asked only in frustration. If you really want an answer, suggest p5p."
How is that a question for p5p? I don't think it is.
| [reply] |
|
Because those active on p5p are among the possible sources of an answer to an extremely esoteric question; an answer that would otherwise probably require that the OP have the experience, knowledge and patience to profit by digging through perlguts and who-knows-what-else.
On the other hand, if you, Anonymonk, can answer it here, I'll be delighted to upvote a brief and correct exposition that's accessible to those of less than Perl Porter stature (like /me).
| [reply] |
|
|
|
Re: XMLin question (xmlfixup.pl)
by Anonymous Monk on Feb 15, 2013 at 19:44 UTC
|
#!/usr/bin/perl --
use strict;
use warnings;
use HTML::Encoding 'encoding_from_http_message';
use WWW::Mechanize;
use Encode;
use HTML::TreeBuilder;

my $file = shift or die "
Usage: xmlfixup.pl file:in.xml > out.xml
xmlfixup.pl http://example.com/foo.xml > out.utf8.xml
";

# Fetch the document; autocheck makes a failed request die.
my $resp = WWW::Mechanize->new( autocheck => 1 )->get( $file );

# Decode to Perl characters if an encoding can be detected,
# otherwise fall back to the raw content.
my $enco = encoding_from_http_message( $resp );
my $utf8;
if( $enco ) {
    $utf8 = decode( $enco => $resp->content );
} else {
    $utf8 = $resp->content;
}

# Lenient parser: keep unknown elements, whitespace, comments and
# processing instructions; add no implicit tags; leave entities as-is.
my $t = HTML::TreeBuilder->new(
    qw(
        ignore_unknown 0
        no_space_compacting 1
        ignore_ignorable_whitespace 0
        implicit_tags 0
        no_expand_entities 1
        store_comments 1
        store_pis 1
    )
);
#~ $t->xml_mode( 1 );
$t->parse_content( $utf8 );

# Serialize each top-level node as XML, UTF-8 encoded.
binmode STDOUT, ':utf8';
print $_->as_XML for $t->content_list;
__END__
| [reply] [d/l] |
|
Fails when data contains <![CDATA[ ... ]]>
| [reply] [d/l] |
|
I would like to use this, with a fix I have written for CDATA and a couple of other things, on XML::Smart.
Please /msg me or reply to this so I can assign credit.
| [reply] [d/l] |