PerlMonks  

Re: XMLin question (xmlfixup.pl)

by Anonymous Monk
on Feb 15, 2013 at 19:44 UTC (#1018953)


in reply to XMLin question

#!/usr/bin/perl --
use strict;
use warnings;
use HTML::Encoding 'encoding_from_http_message';
use WWW::Mechanize;
use Encode;
use HTML::Tree;

my $file = shift or die "
Usage:
    xmlfixup.pl file:in.xml > out.xml
    xmlfixup.pl http://example.com/foo.xml > out.utf8.xml
";

my $resp = WWW::Mechanize->new( autocheck => 1 )->get( $file );
my $enco = encoding_from_http_message( $resp );
my $utf8;
if( $enco ) {
    $utf8 = decode( $enco => $resp->content );
} else {
    $utf8 = $resp->content;
}

my $t = HTML::TreeBuilder->new( qw(
    ignore_unknown 0
    no_space_compacting 1
    ignore_ignorable_whitespace 0
    implicit_tags 0
    no_expand_entities 1
    store_comments 1
    store_pis 1
) );
#~ $t->xml_mode( 1 );
$t->parse_content( $utf8 );

binmode STDOUT, ':utf8';
print $_->as_XML for $t->content_list;
__END__


Re^2: XMLin question (xmlfixup.pl)
by tmharish (Friar) on Feb 21, 2013 at 12:43 UTC
    Fails when data contains <![CDATA[ ... ]]>
Re^2: XMLin question (xmlfixup.pl)
by tmharish (Friar) on Feb 21, 2013 at 12:45 UTC

    I would like to use this in XML::Smart, together with a fix I have written for CDATA and a couple of other things.

    Please /msg me or reply to this so I can assign credit.

      by Anonymous Monk http://perlmonks.org/?node_id=1018953

        Sadly, this breaks in too many cases, so I am rewriting XML::Smart::HTMLParser (also available on GitHub).
