PerlMonks
Mapping ACCEPT_LANG, USER_AGENT & GeoIP to Encode's character sets
by cosmicperl (Chaplain)
on Jun 22, 2012 at 02:29 UTC ( [id://977749]=perlquestion )
cosmicperl has asked for the wisdom of the Perl Monks concerning the following question:
Hi All,

For a while now it's been my job to deal with user uploads. The users are globally dispersed, often have very limited computer skills, and upload text files that were created in various locales. It's very challenging to "Do The Right Thing". Encode::Detect helps a lot, and works great much of the time. I convert everything to UTF-8, so from then on there aren't any issues... well, until the user exports and doesn't know how to open the file in UTF-8 mode (which depends on the program they are using). But I'm not too worried about exports right now.

I'm well aware that where the user is from doesn't strictly matter, as the file they are uploading could potentially be from any locale. But I've found that, for our users at least, there's a pretty consistent link between where they are from and the encoding their uploads tend to be in. For example, Norwegian users on Macs tend to upload files in MacIcelandic, Russian Windows users in Windows-1251, etc.

So what I'm going to do is use HTTP_USER_AGENT, GeoIP and HTTP_ACCEPT_LANGUAGE to give me a best guess at the encoding for when Encode::Detect gets it wrong. This will likely be displayed to the user with translation examples so that they can choose the charset that works.

For the life of me I cannot find on Google any examples of people doing this, or any modules for this kind of mapping on CPAN. Am I missing something? Otherwise I may as well create a new CPAN module for this, so that others in my situation may benefit.

Lyle
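A minimal sketch of what such a mapping could look like, using only the core Encode module. Everything here is an assumption made for illustration: the function names (`accept_languages`, `platform_of`, `candidate_encodings`, `previews`), the `%CANDIDATES` table, and the idea that the GeoIP lookup has already been done elsewhere and is passed in as a two-letter country code. Only the Norwegian/Mac and Russian/Windows rows come from the observations above; the other rows are guesses that would need tuning against real uploads.

```perl
use strict;
use warnings;
use Encode qw(decode find_encoding);

# Hypothetical lookup: "language:platform" => encodings to try, best first.
# Only the no:mac and ru:windows rows reflect the observations in the post;
# every other row is an illustrative guess.
my %CANDIDATES = (
    'no:mac'     => ['MacIcelandic', 'MacRoman'],
    'ru:windows' => ['cp1251'],
    'ru:mac'     => ['MacCyrillic'],
    'ru:*'       => ['koi8-r', 'cp1251'],
    '*:windows'  => ['cp1252'],
    '*:mac'      => ['MacRoman'],
    '*:*'        => ['iso-8859-1'],
);
my %LANG_ALIAS   = (nb => 'no', nn => 'no');   # fold Norwegian variants
my %COUNTRY_LANG = (NO => 'no', RU => 'ru');   # hypothetical GeoIP fallback

# Primary language subtags from an Accept-Language header, best q-value first.
sub accept_languages {
    my ($header) = @_;
    my @found;
    my $i = 0;
    for my $item (split /,/, defined $header ? $header : '') {
        (my $clean = $item) =~ s/\s+//g;
        my ($tag, @params) = split /;/, $clean;
        next unless defined $tag && $tag =~ /^([A-Za-z]+)/;
        my $lang = lc $1;
        my $q    = 1;
        for my $p (@params) { $q = $1 if $p =~ /^q=(\d+(?:\.\d+)?)$/i }
        push @found, [ $LANG_ALIAS{$lang} || $lang, $q, $i++ ];
    }
    # Sort by q-value, keeping header order for ties.
    return map { $_->[0] }
           sort { $b->[1] <=> $a->[1] || $a->[2] <=> $b->[2] } @found;
}

# Very rough platform guess from the User-Agent string.
sub platform_of {
    my ($ua) = @_;
    $ua = '' unless defined $ua;
    return 'windows' if $ua =~ /Windows/i;
    return 'mac'     if $ua =~ /Macintosh|Mac OS X/i;
    return '*';
}

# Ordered, de-duplicated list of encoding names to offer the user.
sub candidate_encodings {
    my (%arg) = @_;
    my $platform = platform_of($arg{user_agent});
    my @langs    = accept_languages($arg{accept_language});
    my $geo      = $COUNTRY_LANG{ uc($arg{country} || '') };
    push @langs, $geo if defined $geo;   # country is the weakest hint

    my (@out, %seen);
    my @keys = map { ($_ . ":$platform", $_ . ':*') } @langs;
    for my $key (@keys, "*:$platform", '*:*') {
        for my $name (@{ $CANDIDATES{$key} || [] }) {
            my $enc = find_encoding($name) or next;   # skip names Encode lacks
            push @out, $enc->name unless $seen{ $enc->name }++;
        }
    }
    return @out;
}

# Decode the same bytes under each candidate so the user can pick by eye.
sub previews {
    my ($bytes, @encodings) = @_;
    return map { my $copy = $bytes; [ $_, decode($_, $copy) ] } @encodings;
}

# Example: a Russian Windows browser.
my @guess = candidate_encodings(
    accept_language => 'ru-RU,ru;q=0.9,en-US;q=0.8',
    user_agent      => 'Mozilla/5.0 (Windows NT 6.1; WOW64)',
    country         => 'RU',
);
print join(', ', @guess), "\n";
```

The list is ordered rather than reduced to a single answer because the goal described above is to show the user several decoded samples and let them choose. Running each name through `find_encoding` means aliases such as Windows-1251 collapse to Encode's canonical name, so the same charset is not offered twice.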