Re^4: UTF-8: Trying to make sense of form input

by creamygoodness (Curate)
on Aug 16, 2009 at 05:41 UTC (#788975)


in reply to Re^3: UTF-8: Trying to make sense of form input
in thread UTF-8: Trying to make sense of form input

I think you're right that the OP needs to grasp the mental model you've laid out.

But I predict that until the OP masters debugging the encoding -- which requires understanding the role of the UTF8 flag -- problems are going to keep cropping up. If there were an "encoded/decoded" flag that you could check, that would be lovely. Since no such flag exists, you need to be able to look at the raw string and the presence/absence of the UTF8 flag in Devel::Peek's Dump() output to see what's going wrong.
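For illustration, here is a minimal sketch of that kind of inspection. It assumes the form input arrives as the raw UTF-8 bytes for U+263A; the variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Devel::Peek qw(Dump);

# Assumption: this is what an undecoded form value looks like,
# i.e. the three UTF-8 bytes that encode U+263A.
my $bytes = "\xE2\x98\xBA";

# Decoding turns the byte string into a one-character string.
my $chars = decode('UTF-8', $bytes);

printf "bytes: length=%d utf8_flag=%s\n",
    length($bytes), utf8::is_utf8($bytes) ? 'on' : 'off';
printf "chars: length=%d utf8_flag=%s\n",
    length($chars), utf8::is_utf8($chars) ? 'on' : 'off';

# Dump() writes to STDERR; look for "UTF8" in the FLAGS line
# and compare the PV bytes of the two scalars.
Dump($bytes);
Dump($chars);
```

The two printf lines report length 3 with the flag off for the byte string and length 1 with the flag on for the decoded string. The Dump() output shows the same underlying bytes in both cases, which is why the flag has to be read alongside the raw string.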

There are simply too many opportunities to mess up. Forget a binmode() here, omit (or include) a -utf8 argument there, forget to set pg_enable_utf8 on your DBD::Pg db handle, pass something through YAML::Syck without setting $YAML::Syck::ImplicitUnicode, and so on.
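As one small example of how easy it is to slip, this sketch prints the same character to two in-memory filehandles, one with an encoding layer and one without. The handle and variable names are illustrative only.

```perl
use strict;
use warnings;

my $e_acute = "\x{e9}";    # one character, U+00E9

# No encoding layer: the character goes out as the single byte 0xE9.
my $raw = '';
open my $plain, '>', \$raw or die "open: $!";
print {$plain} $e_acute;
close $plain;

# With a UTF-8 layer: the same character goes out as two bytes, 0xC3 0xA9.
my $encoded = '';
open my $layered, '>:encoding(UTF-8)', \$encoded or die "open: $!";
print {$layered} $e_acute;
close $layered;

printf "no layer:    %s\n", join ' ', map { sprintf '%02X', ord } split //, $raw;
printf "utf-8 layer: %s\n", join ' ', map { sprintf '%02X', ord } split //, $encoded;
```

Nothing warns in either case, so a missing layer on one handle goes unnoticed until the bytes are read back somewhere that expects UTF-8.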

In short... documentation and Hungarian notation are too unreliable :) -- because the underlying system is too hard to control from a high level.

IMO, the only way to achieve high reliability for UTF-8 is to write tests.

    use Test::More tests => 1;

    my $smiley = "\x{263a}";
    my $maybe  = round_trip($smiley);
    is( $maybe, $smiley, "String survives round trip including UTF8 flag" );
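round_trip() is left undefined above, since it stands for whatever path your data actually takes (form, database, YAML, and so on). As a hedged, self-contained sketch, here is a hypothetical round_trip() that writes to a temporary file through a UTF-8 layer and reads it back the same way; swap in your own pipeline.

```perl
use strict;
use warnings;
use Test::More tests => 2;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the real pipeline: write the string to a
# temp file through a UTF-8 layer, then read it back through one.
sub round_trip {
    my ($string) = @_;

    my ($out, $path) = tempfile(UNLINK => 1);
    binmode($out, ':encoding(UTF-8)') or die "binmode: $!";
    print {$out} $string;
    close $out or die "close: $!";

    open my $in, '<:encoding(UTF-8)', $path or die "open: $!";
    my $copy = do { local $/; <$in> };    # slurp the whole file
    close $in;

    return $copy;
}

my $smiley = "\x{263a}";
my $maybe  = round_trip($smiley);

is( $maybe, $smiley, 'string survives round trip' );
ok( utf8::is_utf8($maybe), 'UTF8 flag is set on the result' );
```

If you drop either encoding layer from this sketch, the first assertion fails, which is the kind of breakage the test is meant to catch. The second assertion just makes the flag check explicit.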

PS: You updated your node multiple times over the half hour or so after it was posted, forcing me to keep rewriting my reply. :(
