PerlMonks  

Re: HTML::Entities and Unicode quotes

by Anonymous Monk
on Aug 20, 2011 at 02:08 UTC (#921355)


in reply to HTML::Entities and Unicode quotes

"HTML::Entities isn't correctly handling quotes as defined in this Unicode table."

A poor workman blames his tools :)

http://cpansearch.perl.org/src/GAAS/HTML-Parser-3.68/lib/HTML/Entities.pm
    'lsquo;' => chr(8216),
    'rsquo;' => chr(8217),
    'sbquo;' => chr(8218),
    'ldquo;' => chr(8220),
    'rdquo;' => chr(8221),
    'bdquo;' => chr(8222),

    #!/usr/bin/perl --
    use strict;
    use warnings;
    use utf8;
    use HTML::Entities;
    binmode STDOUT, ':encoding(UTF-8)';

    {
        my $line = "This is a test of \xe2\x80\x9cquotes\xe2\x80\x9d\n";
        print encode_entities($line, "\200-\377");    # looking for &ldquo; & &rdquo; in the output
        print $line;
    }
    print '#' x 11, "\n";
    {
        my $line = 'xThis is a test of '.chr(8220).'quotes'.chr(8222)."\n";
        print encode_entities($line, '\x{201c}\x{201e}');
        print $line;
    }
    print '#' x 11, "\n";
    {
        my $line = 'xThis is a test of '.chr(8220).'quotes'.chr(8222)."\n";
        print encode_entities($line, chr(8220).chr(8222));
        print $line;
    }
    __END__
    This is a test of “quotes”
    This is a test of “quotes”
    ###########
    xThis is a test of “quotes„
    xThis is a test of “quotes„
    ###########
    xThis is a test of “quotes„
    xThis is a test of “quotes„
