PerlMonks  

Re: incremental reading of utf8 input handles

by The Code Captain (Initiate)
on Jul 09, 2012 at 18:13 UTC (#980736)


in reply to incremental reading of utf8 input handles

I don't think this is a problem, depending on which version of Perl you are using, and provided that all of your code consistently uses UTF-8. (The text doesn't all have to be in the same language, but it does have to be in the same encoding.)

From perlunicode:

Beginning with version 5.6, Perl uses logically-wide characters to represent strings internally. Starting in Perl 5.14, Perl-level operations work with characters rather than bytes within the scope of a use feature 'unicode_strings' (or equivalently use 5.012 or higher). (This is not true if bytes have been explicitly requested by use bytes, nor necessarily true for interactions with the platform's operating system.)

Whenever I have used UTF-8, I have not had a problem with characters being split across buffers, because Perl itself knows that the buffer holds characters and how many bytes are required to represent each one. Just make sure that you are consistently using the UTF-8 encoding.
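To illustrate, here is a minimal sketch of reading incrementally, assuming the handle is opened with an :encoding(UTF-8) layer so that read counts characters rather than bytes. The in-memory handle and the read_in_chunks helper are my own inventions for the demonstration, not something from the original question:

```perl
use strict;
use warnings;

# Raw UTF-8 bytes for three characters of 1, 2 and 3 bytes:
# "a", U+00E9 (e-acute) and U+20AC (euro sign).
my $bytes = "a\xC3\xA9\xE2\x82\xAC";

# Hypothetical helper: pull up to $n characters per call to read()
# until the handle is exhausted, returning the chunks in order.
sub read_in_chunks {
    my ($fh, $n) = @_;
    my @chunks;
    while (read($fh, my $buf, $n)) {
        push @chunks, $buf;
    }
    return @chunks;
}

# Open the byte string as an in-memory file with a decoding layer,
# so the handle delivers characters, not bytes.
open my $fh, '<:encoding(UTF-8)', \$bytes or die "open failed: $!";
my @chunks = read_in_chunks($fh, 1);
close $fh;

printf "%d bytes in, %d characters out\n", length($bytes), scalar @chunks;
```

Asking for one character at a time is the worst case for splitting, and each chunk should still come back as one whole character. This only shows the layered read path; code that bypasses the layer and handles raw bytes itself would have to do its own reassembly.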

