in reply to HTML::Strip and UTF8 -- is there some way I can just skip all the "UTF8 only" entities?
Answering my own question (partially), I think I have to do something along the lines of
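    use strict;
    use warnings;
    use Encode;    # provides Encode::encode and Encode::decode_utf8

    my $utf8String   = "\x{2019}";    # RIGHT SINGLE QUOTATION MARK, not in Latin-1
    my $latin1String = latin1ify($utf8String);
    print "$latin1String\n";

    # Convert to ISO-8859-1; unmappable characters are replaced with "?".
    sub latin1ify {
        my $string = shift;
        $string = "" unless defined $string;
        # Decode only raw UTF-8 bytes; leave character strings alone.
        $string = Encode::decode_utf8($string) unless utf8::is_utf8($string);
        return Encode::encode("iso-8859-1", $string);
    }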
which prints "?" for the character that has no Latin-1 equivalent, and then strip the question marks.
But I have to go now, so I'll finish this another time.
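One catch with stripping the question marks afterwards is that it would also remove any genuine "?" in the text. A possible way around that, as a minimal sketch only: drop the characters above U+00FF before encoding, so no "?" substitution happens in the first place. The helper name latin1_only is made up for illustration, and it assumes the input is already a decoded character string rather than raw UTF-8 bytes.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of HTML::Strip or Encode): remove every
# character that has no ISO-8859-1 equivalent, then encode what is left.
sub latin1_only {
    my $string = shift;
    $string = "" unless defined $string;

    # Assumption: $string holds decoded characters, not raw UTF-8 bytes.
    # Anything outside U+0000..U+00FF cannot be represented in Latin-1.
    $string =~ s/[^\x{00}-\x{FF}]//g;

    return Encode::encode("iso-8859-1", $string);
}

# The curly apostrophe is dropped, the real question mark survives.
print latin1_only("It\x{2019}s done?"), "\n";
```

Whether silently dropping characters is acceptable depends on the data; mapping common ones (for example U+2019 to a plain apostrophe) before the strip may read better.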