PerlMonks  

Re^7: Unicode strings internals

by kennethk (Monsignor)
on May 10, 2013 at 22:13 UTC (#1033048)


in reply to Re^6: Unicode strings internals
in thread [SOLVED] Unicode strings internals

It sounds like your bug would only rear its head when $id actually contains non-ASCII characters. The canonical method for handling this, as I understand it, is to explicitly encode incoming text streams that are potentially problematic; i.e.

use Encode qw(encode);  # provides encode()
my ($id, $filename) = split /\t/, $record;
$id = encode("UTF-8", $id);
I'd watch out for the 'filtering programmer input' trap in all this; the Perl philosophy of giving people as much rope as they like means that a properly motivated foolish programmer can always outwit your filtering. Since you expect that $id is printable ASCII, I'd be more inclined to filter using my regex above, and re-examine the logic that introduced UTF encoding sensitivity into the code in the first place. YMMV, of course.
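
For illustration, the filtering approach might look something like the sketch below. The regex from the earlier post is not reproduced here, so the pattern (printable ASCII, space through tilde), the helper name is_printable_ascii, and the sample record are all assumptions rather than the original code.

```perl
use strict;
use warnings;

# Hypothetical validator: true only for non-empty, printable-ASCII strings.
# The character range \x20-\x7E is an assumed stand-in for the earlier regex.
sub is_printable_ascii {
    my ($text) = @_;
    return (defined($text) && $text =~ /\A[\x20-\x7E]+\z/) ? 1 : 0;
}

# Made-up sample record in the tab-separated form discussed above.
my $record = "abc123\tsome/file.txt";
my ($id, $filename) = split /\t/, $record;

# Reject the record outright rather than trying to repair it.
die "Unexpected characters in id\n" unless is_printable_ascii($id);
```

Rejecting bad input this way keeps the check independent of how the string happens to be stored internally, since the match looks only at the characters.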

#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.


Re^8: Unicode strings internals
by vsespb (Hermit) on May 10, 2013 at 22:32 UTC
    It sounds like your bug would only rear its head when $id actually contains non-ASCII characters.
    No! ASCII only: letters and digits. Just like in the example in my original post:
    my $utfstring = "123 \x{439}\x{439}\x{439}\x{439}";
    my ($ascii_but_utf, undef) = split ' ', $utfstring;
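
A minimal sketch of how one might inspect that situation, using the built-in utf8::is_utf8 and utf8::downgrade functions. The variable names $flag_before, $is_ascii, and $flag_after are introduced here for illustration only, and this is not necessarily how the original code detected or fixed the problem.

```perl
use strict;
use warnings;

# Source string contains Cyrillic characters, so Perl stores it
# internally as UTF-8 with the UTF8 flag on.
my $utfstring = "123 \x{439}\x{439}\x{439}\x{439}";
my ($ascii_but_utf, undef) = split ' ', $utfstring;

# The first field holds only ASCII characters, yet it inherits
# the internal representation of the string it was split from.
my $flag_before = utf8::is_utf8($ascii_but_utf) ? 1 : 0;
my $is_ascii    = $ascii_but_utf =~ /\A[\x00-\x7F]*\z/ ? 1 : 0;

# utf8::downgrade switches the internal storage back to bytes;
# the characters themselves are unchanged.
utf8::downgrade($ascii_but_utf);
my $flag_after = utf8::is_utf8($ascii_but_utf) ? 1 : 0;

print "flag before: $flag_before, ascii only: $is_ascii, flag after: $flag_after\n";
```

Note that utf8::downgrade dies by default if the string contains characters above 0xFF, so it only suits fields already known to be ASCII or Latin-1.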
