PerlMonks  

Re: Why is utf8 flag set after Encode::decode of pure ASCII?

by creamygoodness (Curate)
on Mar 29, 2010 at 19:02 UTC (#831680)


in reply to Why is utf8 flag set after Encode::decode of pure ASCII?

ASCII strings may follow different paths through the code depending on whether the SVf_UTF8 flag is set, but the end results should be exactly the same. That makes it hard to maintain discipline as to whether the flag should be on or off, and in practice, you can't count on it being one way or the other.
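To make that concrete, here is a small sketch (mine, not from the original question) comparing two copies of the same ASCII string, one with the flag off and one with it forced on via utf8::upgrade. The variable names are just for illustration.

```perl
use strict;
use warnings;

my $plain   = "hello";   # ASCII literal, SVf_UTF8 off
my $flagged = "hello";
utf8::upgrade($flagged); # same characters, SVf_UTF8 now on

# The flag differs between the two scalars...
my $plain_flag   = utf8::is_utf8($plain)   ? 1 : 0;
my $flagged_flag = utf8::is_utf8($flagged) ? 1 : 0;

# ...but string operations give the same answers either way.
my $same_string = ($plain eq $flagged)                 ? 1 : 0;
my $same_length = (length($plain) == length($flagged)) ? 1 : 0;
my $same_upper  = (uc($plain) eq uc($flagged))         ? 1 : 0;

print "flags: $plain_flag vs $flagged_flag\n";
print "eq: $same_string, length: $same_length, uc: $same_upper\n";
```

Code that branches on utf8::is_utf8 to decide what a string "means" is the thing that breaks here; code that only looks at the characters does not care.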

If you have an all-Unicode application or subsystem, it sometimes makes sense to convert each string to the internal UTF-8 representation at the boundary where it enters the subsystem, so that you don't have to continually run UTF-8 byte sequence validity checks to see whether the scalar is pure ASCII or contains high-bit (non-ASCII) characters. The easy way to do this is to turn the SVf_UTF8 flag on even if it's an ASCII string. One of my XS distros does this.
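A rough pure-Perl approximation of that boundary step, assuming the input is already a character string (decoded text or plain ASCII). The helper name to_internal_utf8 is hypothetical, and this is not the XS code from my distro, which sets the flag at the C level; utf8::upgrade is the closest safe equivalent from Perl space.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of $str stored in Perl's internal
# UTF-8 representation. Assumes $str holds characters, not raw
# UTF-8-encoded bytes (decode those first).
sub to_internal_utf8 {
    my ($str) = @_;
    utf8::upgrade($str);   # sets SVf_UTF8 on the copy, even for pure ASCII
    return $str;
}

my $ascii  = to_internal_utf8("abc");
my $latin1 = to_internal_utf8("caf\xe9");   # 'e' with acute, one character

my $ascii_flagged  = utf8::is_utf8($ascii)  ? 1 : 0;
my $latin1_flagged = utf8::is_utf8($latin1) ? 1 : 0;
my $latin1_length  = length($latin1);       # counts characters, not bytes
```

Past this point the subsystem can assume every scalar is flagged, and the string values themselves are unchanged.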
