Read (sysread) binary data into utf8 string
by vr (Curate) on Apr 03, 2017 at 16:53 UTC
vr has asked for the wisdom of the Perl Monks concerning the following question:
A "binary" file for us:
Not sure if it's a bug or not.
The sysread documentation says: "Note that if the filehandle has been marked as :utf8, Unicode characters are read instead of bytes (the LENGTH, OFFSET, and the return value of sysread are in Unicode characters)."
Does this imply that if the filehandle has not been marked, OFFSET is counted in bytes? If so, could the utf8 in the target string become invalid?
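One way to probe the question is a small script like the one below. It is only a sketch: the six test bytes, the temporary file, and the "\x{263A}.png" placeholder (standing in for a scalar that already carries the UTF8 flag) are my own assumptions, not taken from the original case. It prints what Perl reports after the read and makes no claim about what the answer should be.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write six arbitrary bytes, two of them above 0x7F, to a temporary file.
my $data = "\x89PNG\xFF\x00";
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} $data;
close $out or die "close failed: $!";

# A scalar that already carries the UTF8 flag (placeholder for a file name).
my $buf = "\x{263A}.png";

# Read from a handle with no :utf8 layer into the flagged scalar.
open my $in, '<:raw', $path or die "open failed: $!";
my $n = sysread($in, $buf, length $data);    # OFFSET defaults to 0
die "sysread failed: $!" unless defined $n;
close $in;

# Report how the target scalar looks after the read.
printf "bytes read: %d\n", $n;
printf "length:     %d\n", length $buf;
printf "utf8 flag:  %s\n", utf8::is_utf8($buf) ? 'on' : 'off';
printf "same data:  %s\n", $buf eq $data ? 'yes' : 'no';
```

Passing a non-zero OFFSET as the fourth argument to sysread would exercise the other half of the question.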
I think that if OFFSET is 0, the string's utf8-ness should match the filehandle's IO encoding layer, regardless of what the scalar held before. I.e., the read should produce the same result as slurping, above. And if OFFSET is not zero, then what? Perhaps this should be documented more clearly, including the combinations that should never be used.
BTW, it looks like this is what's behind this bug. Tk passes the file name as utf8, and this parameter is (rather recklessly) re-used (!) to receive the file content.
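A defensive pattern that sidesteps the question is to never read into a scalar that came from somewhere else. The sketch below is not Tk's code: read_file_bytes is a hypothetical helper, and the temporary file and its six bytes are made up for the demonstration. It reads into a fresh buffer and then calls utf8::downgrade as a sanity check.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return up to $len bytes from $path as a byte string.
sub read_file_bytes {
    my ($path, $len) = @_;
    open my $fh, '<:raw', $path or die "open '$path' failed: $!";
    my $buf = '';    # fresh scalar, never the caller's file name
    my $n = sysread($fh, $buf, $len);
    die "sysread failed: $!" unless defined $n;
    close $fh;
    # No-op for a byte string; dies if a wide character had slipped in.
    utf8::downgrade($buf);
    return $buf;
}

# Demo data: six arbitrary bytes in a temporary file.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "\x89PNG\xFF\x00";
close $out or die "close failed: $!";

utf8::upgrade($path);    # simulate a file name that arrives UTF8-flagged
my $bytes = read_file_bytes($path, 6);
printf "utf8 flag: %s, length: %d\n",
    utf8::is_utf8($bytes) ? 'on' : 'off', length $bytes;
```

Because the buffer starts out as an empty byte string and the handle has no :utf8 layer, the result stays a plain byte string no matter how the file name was flagged.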