PerlMonks  

XML::Parser and numeric entities

by gam3 (Curate)
on Jan 14, 2010 at 01:32 UTC ( #817319=perlquestion )
gam3 has asked for the wisdom of the Perl Monks concerning the following question:

Is there a way to keep XML::Parser from converting numeric entities into UTF-8?

Or is there some other parser that will let me do this?

use strict;
use XML::Parser;
use vars qw($parser);

sub handle_start {
    my $self = shift;
    my $x = shift;
    print "<" . $x . '>';
}

sub handle_end {
    my $self = shift;
    my $x = shift;
    print "</" . $x . '>';
}

sub handle_char {
    my $self = shift;
    my $x = shift;
    print $x;
}

$parser = XML::Parser->new(
    Handlers => {
        Start => \&handle_start,
        End   => \&handle_end,
        Char  => \&handle_char
    }
);

$parser->parse(<<XML);
<start>&#8211;</start>
XML

I would like this program to output
<start>&#8211;</start>
not
<start>–</start>
-- gam3
A picture is worth a thousand words, but takes 200K.

Re: XML::Parser and numeric entities
by ikegami (Pope) on Jan 14, 2010 at 02:53 UTC

    It simply decodes the entities. It doesn't then encode the character using UTF-8.
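To make that distinction concrete, here is a minimal sketch (the variable names are mine, not from the original post) that inspects what the Char handler actually receives. It assumes a reasonably recent XML::Parser, which hands handlers decoded Perl character strings; the UTF-8 bytes only show up when the string is written to an output handle.

```perl
use strict;
use warnings;
use XML::Parser;

# Collect character data so we can inspect what the Char handler receives.
my $text = '';

my $parser = XML::Parser->new(
    Handlers => {
        Char => sub { my ($expat, $string) = @_; $text .= $string; },
    },
);
$parser->parse('<start>&#8211;</start>');

# The handler saw one decoded character (U+2013), not three UTF-8 bytes.
printf "length=%d codepoint=U+%04X\n", length($text), ord($text);

# The UTF-8 byte sequence is produced only when an output encoding is applied.
binmode(STDOUT, ':encoding(UTF-8)');
print "<start>$text</start>\n";
```

So the "conversion to UTF-8" happens at print time, which is why re-encoding inside the handler is enough to change the output.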

    If you want all non-ASCII characters encoded, you can use:

    use HTML::Entities qw( encode_entities_numeric );

    sub handle_char {
        my $self = shift;
        my $x = shift;
        print encode_entities_numeric($x);
    }
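For reference, a self-contained sketch of that suggestion applied to the program from the question. It builds the output in a string (`$out`, my own name) instead of printing directly. One thing to be aware of: encode_entities_numeric emits hexadecimal references, so the result is `&#x2013;` rather than the `&#8211;` in the input. The `encode_decimal` helper below is hypothetical, not part of HTML::Entities, and is only one possible way to get decimal references.

```perl
use strict;
use warnings;
use XML::Parser;
use HTML::Entities qw( encode_entities_numeric );

my $out = '';

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub { my ($expat, $name) = @_; $out .= "<$name>"; },
        End   => sub { my ($expat, $name) = @_; $out .= "</$name>"; },
        # Re-encode markup-significant and non-ASCII characters as references.
        Char  => sub { my ($expat, $string) = @_; $out .= encode_entities_numeric($string); },
    },
);
$parser->parse('<start>&#8211;</start>');
print "$out\n";    # <start>&#x2013;</start>

# Hypothetical helper (not from HTML::Entities): decimal references instead.
sub encode_decimal {
    my ($s) = @_;
    $s =~ s/([&<>"]|[^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $s;
}

print encode_decimal("\x{2013}"), "\n";    # &#8211;
```

Both forms refer to the same character, so any XML consumer should treat them identically; the decimal variant only matters if the output has to match the input byte for byte.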

    There's also a handler you can use instead of Char that receives the entities still encoded, but then you're not guaranteed to have all non-ASCII characters encoded.
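A rough sketch of that second approach, assuming the handler meant here is XML::Parser's Default handler (the post does not name it). When no Char handler is registered, expat passes character data and character references to Default as they appear in the document, so the reference should come through untouched. Literal non-ASCII characters in the input come through untouched as well, which is the caveat mentioned above.

```perl
use strict;
use warnings;
use XML::Parser;

my $out = '';

my $parser = XML::Parser->new(
    Handlers => {
        Start   => sub { my ($expat, $name) = @_; $out .= "<$name>"; },
        End     => sub { my ($expat, $name) = @_; $out .= "</$name>"; },
        # No Char handler is registered, so character data and character
        # references fall through to Default in their original form.
        Default => sub { my ($expat, $string) = @_; $out .= $string; },
    },
);
$parser->parse('<start>&#8211;</start>');
print "$out\n";
```

Note that Default also receives anything else without a handler of its own (comments, processing instructions, the XML declaration), so a real program would need to decide what to do with those.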

      Thank you for that information; I can use it to patch up my problem.

      However, what I really want is for XML::Parser to NOT decode the numeric entities at all.

      -- gam3
      A picture is worth a thousand words, but takes 200K.
