Sixtease has asked for the wisdom of the Perl Monks concerning the following question:
Dear monks,
HTML::Parser provides the HTML::Entities::_decode_entities function, which is the lower-level counterpart of HTML::Entities::decode. I use it in my XML::Entities module to do the real work. However, it appears that older versions of HTML::Parser don't handle Unicode entities.
use HTML::Parser;
$x = "&#345;";  # numeric character reference for U+0159 (ř)
HTML::Entities::_decode_entities($x, {});
print "$x\n";
# outputs "ř" on 3.56
# outputs "&#345;" on 3.35
The changelog for HTML::Parser says that as of version 3.39_90, Unicode entities are always handled on perl 5.8+ and that this is "no longer a compile-time directive". However, I found nothing about such a directive in the earlier versions.
So, my question is: How can I make the older versions of HTML::Parser handle Unicode entities?
use strict; use warnings; print "Just Another Perl Hacker\n";
Replies are listed 'Best First'.
Re: Decoding unicode entities with HTML::Parser
by ikegami (Patriarch) on Apr 09, 2008 at 10:09 UTC
by Sixtease (Friar) on Apr 09, 2008 at 11:41 UTC
Re: Decoding unicode entities with HTML::Parser
by ikegami (Patriarch) on Apr 09, 2008 at 09:18 UTC
Re: Decoding unicode entities with HTML::Parser
by Juerd (Abbot) on Apr 09, 2008 at 09:47 UTC