CGI.pm does not decode or encode. The $q->charset method only sets the character set for the Content-Type header.
This means that you have to decode and encode manually (e.g. by using PerlIO layers). Decode everything you receive, and encode everything you are about to send.
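As a minimal sketch of that decode-in, encode-out discipline, the following uses only the core Encode module. The variable $raw_bytes is a stand-in for whatever raw value your script received (for example from $q->param); the names and the sample value are assumptions made for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Stand-in for a raw request value: the UTF-8 bytes for "caf" + e-acute.
my $raw_bytes = "caf\xC3\xA9";

# Decode on the way in: five bytes become four characters.
my $name = decode('UTF-8', $raw_bytes);

# Work on characters, not bytes.
my $greeting = "Hello, " . ucfirst($name) . "!";

# Encode on the way out: characters become UTF-8 bytes again.
my $out_bytes = encode('UTF-8', $greeting);
print $out_bytes, "\n";
```

Instead of calling encode for every string, you could put a layer on the output handle once with binmode(STDOUT, ':encoding(UTF-8)') and then print character strings directly.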
URL-encoded data is byte data, typically with no indication of which encoding was used. With POST requests, a charset attribute may accompany Content-Type: application/x-www-form-urlencoded, but the standard neither requires it nor specifies a default. In practice, even when the attribute is present, it is most often ignored.
Query strings and form data are usually, but not always, encoded with the same encoding (charset) as the HTML page that contains the form. My advice for those who have standardized on UTF-8 is to try UTF-8 decoding first and, if the data is not valid UTF-8, to fall back to ISO-8859-1.
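The try-UTF-8-then-fall-back advice could be sketched roughly as follows. The helper name decode_form_value is hypothetical; it relies on the Encode module's FB_CROAK check value, which makes decode die on ill-formed input so that the failure can be caught with eval.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: strict UTF-8 first, ISO-8859-1 as the fallback.
sub decode_form_value {
    my ($bytes) = @_;

    # decode() with a CHECK value may modify its input, so work on a copy.
    my $copy = $bytes;
    my $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
    return $text if defined $text;

    # Any byte sequence is valid ISO-8859-1, so this decode cannot fail.
    return decode('ISO-8859-1', $bytes);
}

# "na" + i-diaeresis + "ve", once as UTF-8 bytes and once as ISO-8859-1 bytes.
my $from_utf8   = decode_form_value("na\xC3\xAFve");
my $from_latin1 = decode_form_value("na\xEFve");
```

Both calls should yield the same five-character string. Keep in mind that the fallback is a guess: bytes that are not valid UTF-8 could also be in some other legacy encoding.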
Note that you MUST NOT use "utf8" when decoding CGI data. It is Perl's lax variant rather than the strict standard encoding, and as such skips the sanity checks that reject ill-formed input. This may cause internal corruption and security bugs. Instead of "utf8", use "UTF-8".