Character sets: converting to UTF8 with Perl 5.6?
by webfiend (Vicar) on Oct 31, 2000 at 02:11 UTC
webfiend has asked for the wisdom of the Perl Monks concerning the following question:
I am writing a CGI script that takes a request from a user and polls an assortment of online resources for possible results (think of a meta-search engine, and you're on the right track). It then dissects the HTML returned by each of these resources and presents the results to the user in a single, unified interface.
Most of it works quite smoothly, thanks to the magic of LWP::Parallel. There is one little snag, though. The results may span multiple languages and character sets, so I need to convert everything to a single encoding that can represent everything from Western Latin text to Japanese text that arrives as Shift JIS. I'm guessing that UTF-8 (or UTF-16) will work for that purpose.
Now that I've got the setup out of the way, it's time for the question itself:
Assuming I am able to determine a string's original encoding, how would I convert that string to UTF-8 (or some other encoding)?
Any kind of information would be helpful, including pointers to modules and documentation for me to "RTFM" :)
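To make the question concrete, here is a minimal sketch of the kind of conversion being asked about, not a definitive recipe. It uses the Encode module, which decodes bytes from a named encoding into Perl character strings and encodes them back out as UTF-8. One loud assumption: Encode only ships with Perl 5.8 and later, so it is not available on 5.6, where a CPAN module such as Text::Iconv, Unicode::String, or Jcode would have to stand in. The helper name to_utf8 and the sample strings are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);   # assumption: Perl 5.8+, not available on 5.6

# Hypothetical helper: take raw bytes in a known source encoding
# and return the same text as UTF-8 bytes.
sub to_utf8 {
    my ($bytes, $from) = @_;
    my $chars = decode($from, $bytes);   # bytes -> Perl character string
    return encode('UTF-8', $chars);      # character string -> UTF-8 bytes
}

# Sample inputs (illustrative only).
my $latin1 = "caf\xE9";              # "cafe" with e-acute, in ISO-8859-1
my $sjis   = "\x93\xFA\x96\x7B";     # two kanji ("Japan") in Shift JIS

my $latin1_as_utf8 = to_utf8($latin1, 'iso-8859-1');
my $sjis_as_utf8   = to_utf8($sjis,   'shiftjis');

# Both results can now share one page declared as charset=utf-8.
binmode(STDOUT);                     # write the bytes through untouched
print $latin1_as_utf8, "\n", $sjis_as_utf8, "\n";
```

The two-step decode/encode shape keeps the source encoding as the only per-resource variable, which fits the case where each fetched page may arrive in a different character set.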
"All you need is ignorance and confidence; then success is sure." -- Mark Twain