PerlMonks  

Re: The unicode / utf8 struggle, part 2: regexes

by mattr (Curate)
on May 22, 2007 at 09:41 UTC (#616710)


in reply to The unicode / utf8 struggle, part 2: regexes

Hi. The comments above are masterful, but since I noticed this module in the CPAN Nodelet, I thought I'd mention HTML::Encoding. Apparently it helps you figure out what encoding is coming in at you, using the function mentioned above. It might even work, though I haven't used it myself. Good luck!

HTML::Encoding helps to determine the encoding of HTML and XML/XHTML documents...
    use HTML::Encoding 'encoding_from_http_message';
    use LWP::UserAgent;
    use Encode;

    my $resp = LWP::UserAgent->new->get('http://www.example.org');
    my $enco = encoding_from_http_message($resp);
    my $utf8 = decode($enco => $resp->content);
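
The snippet above hands whatever name was detected straight to decode. As a rough sketch only, here is one way to guard that step in case detection yields no usable name. The helper decode_with_fallback and the UTF-8 default are my own assumptions, not part of HTML::Encoding; only the core Encode module is used, so it runs without a network request.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper (not part of HTML::Encoding): decode raw bytes with the
# detected encoding name, or with a fallback when the name is missing or unknown.
sub decode_with_fallback {
    my ($bytes, $detected, $fallback) = @_;
    $fallback = 'UTF-8' unless defined $fallback;   # assumed default

    # find_encoding returns undef for names Encode does not recognise
    my $enc = defined $detected ? find_encoding($detected) : undef;
    $enc = find_encoding($fallback) unless $enc;
    die "unknown fallback encoding: $fallback\n" unless $enc;

    # decode turns the bytes into a Perl character string
    return $enc->decode($bytes);
}

my $bytes = "caf\xC3\xA9";                        # "café" as UTF-8 bytes
my $text  = decode_with_fallback($bytes, undef);  # pretend nothing was detected
print length($text), "\n";                        # prints 4 (characters, not bytes)
```

Once the content is a character string rather than raw bytes, regexes such as \w+ match whole characters, which is the point of decoding before matching.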

