in reply to Re^5: Getting mad with CGI::Application and utf8
in thread Getting mad with CGI::Application and utf8
So as the average John Doe Perl hacker, what should I use to find out if a certain module or sub returns text strings or binary strings?
Warning: culture shock ahead.
How can I determine if a string is a text string or a binary string?
You can't. Some people use the UTF8 flag for this, but that's misuse, and it makes well-behaved modules like Data::Dumper look bad. The flag is useless for this purpose, because it is also off for any text string that Perl stores internally in an 8-bit encoding (by default ISO-8859-1).
This is something you, the programmer, have to keep track of; sorry. You could consider adopting a kind of "Hungarian notation" to help with this.
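To see why the flag says nothing about text versus binary, here is a minimal sketch using only the core utf8:: functions. The variable names are just for illustration. The same text string is held twice, once in each internal representation:

```perl
use strict;
use warnings;

# A text string whose characters all fit in 8 bits; Perl stores it
# internally as ISO-8859-1, so the UTF8 flag is off.
my $text = "caf\x{e9}";

# Same string, but force the other internal representation.
# utf8::upgrade changes storage only, never the characters.
my $copy = $text;
utf8::upgrade($copy);

print utf8::is_utf8($text) ? "on" : "off", "\n";      # off
print utf8::is_utf8($copy) ? "on" : "off", "\n";      # on
print $text eq $copy ? "equal" : "different", "\n";   # equal
print length($text), " ", length($copy), "\n";        # 4 4
```

Both variables hold the same four characters, yet the flag differs, so testing it would classify one piece of text as "binary".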
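One possible shape for such a naming convention, as a sketch only: the $bin_ and $txt_ prefixes are my own invention for this example, and the input is assumed to be UTF-8 encoded. The only real API used is decode and encode from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# $bin_ = binary string (bytes), $txt_ = text string (characters).
# These prefixes are a hypothetical convention, nothing Perl enforces.
my $bin_input = "caf\xc3\xa9";                   # e.g. bytes read from a socket

my $txt_name  = decode('UTF-8', $bin_input);     # bytes -> characters
my $txt_upper = uc $txt_name;                    # text operation on text

my $bin_output = encode('UTF-8', $txt_upper);    # characters -> bytes

print length($bin_input), "\n";    # 5 (bytes)
print length($txt_name), "\n";     # 4 (characters)
```

With the prefix in the name, passing a $bin_ variable to a text operation, or printing a $txt_ variable without encoding it first, stands out when you read the code.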
There is no way to determine whether a string is binary or text. Every operation (including your own subroutines) should handle a single mode: either text or binary. If you want to handle both kinds of string, and for any reason need to know the difference between bytes and characters with the same ordinal values, you will have to specify multiple routines, or a way to indicate that a certain string is binary rather than text.
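As a rough sketch of "a way to indicate that a certain string is binary", you could wrap binary data in a small marker class. Everything here is hypothetical: My::Binary and to_wire are made-up names for this example and say nothing about how any existing or planned module works.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Scalar::Util qw(blessed);

{
    # Hypothetical marker class: an object of this class means
    # "these are bytes, do not encode them".
    package My::Binary;
    sub new   { my ($class, $bytes) = @_; return bless \$bytes, $class }
    sub bytes { my ($self) = @_; return $$self }
}

# Hypothetical output routine: plain scalars are treated as text and
# encoded; My::Binary objects are passed through untouched.
sub to_wire {
    my ($value) = @_;
    return $value->bytes
        if blessed($value) && $value->isa('My::Binary');
    return encode('UTF-8', $value);
}

my $as_text   = to_wire("\x{e9}");                    # character U+00E9
my $as_binary = to_wire(My::Binary->new("\xe9"));     # byte 0xE9

print length($as_text), " ", length($as_binary), "\n";   # 2 1
```

The two inputs have the same ordinal value, and only the explicit marker tells the routine which one it got.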
Just an advance warning: you may want to argue that this is a stupid concept, but eventually you'll have to accept that Perl just works like this. I personally think the model is well thought through.
See also this journal post and the discussion tree that follows it. I plan to release a module called BLOB that lets you (and everyone else) flag a string as "this is binary, not text".