
Re^2: Problem with utf8 after nearly 4096 bytes

by McA (Priest)
on Sep 02, 2013 at 07:34 UTC


in reply to Re: Problem with utf8 after nearly 4096 bytes
in thread Problem with utf8 after nearly 4096 bytes

Hi,

It's just a guess on my part what "Anonymous Monk" wanted to avoid with his recommendation. When you read a file blockwise, the last byte in your buffer can be the first byte of a character that is encoded as two or more bytes. For example, the German Ä has the following UTF-8 representation: 0xc3 0x84

xxxxxx 0xc3|0x84 xxxxxx
-----------^ End of buffer

This could lead to decoding errors. But when you read AND DECODE linewise, you can be pretty sure that all bytes read up to the NL (or whatever your line ending is) can be decoded properly, because in UTF-8 the newline byte never occurs inside a multi-byte sequence.
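To make the buffer-boundary problem concrete, here is a minimal sketch. It assumes the blocks are decoded by hand with the core Encode module; the two-byte block size and the variable names are made up for illustration. It compares decoding each block on its own with carrying the undecoded tail over to the next block via Encode::FB_QUIET.

```perl
use strict;
use warnings;
use Encode qw(decode FB_QUIET);

# Raw bytes for "xÄy": 0x78, then 0xC3 0x84 (Ä), then 0x79.
my $bytes = "x\xC3\x84y";

# Simulate two block reads where the boundary falls between 0xC3 and 0x84.
my @chunks = (substr($bytes, 0, 2), substr($bytes, 2));

# Naive approach: decode every block independently.
# Both halves of the split character are malformed on their own.
my $naive = '';
for my $chunk (@chunks) {
    my $copy = $chunk;    # decode on a copy, leave @chunks untouched
    $naive .= decode('UTF-8', $copy);
}

# Carry-over approach: with FB_QUIET, decode() returns what it could
# decode and leaves the unprocessed bytes in $pending for the next round.
my ($pending, $text) = ('', '');
for my $chunk (@chunks) {
    $pending .= $chunk;
    $text .= decode('UTF-8', $pending, FB_QUIET);
}

printf "naive:      %d characters\n", length $naive;
printf "carry-over: %d characters, %d bytes pending\n",
    length $text, length $pending;
```

Note that FB_QUIET also stops at genuinely invalid input, so real code would need to check for leftover bytes at end of file. Reading and decoding line by line avoids the bookkeeping altogether.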

Putting a decoding layer on your filehandle should also work with the handle you get from an upload, so

binmode($fh, ":utf8");

should be valid too. The stricter :encoding(UTF-8) layer additionally validates the input, which :utf8 does not.
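A small sketch of the layer approach follows. The in-memory filehandle is only a stand-in for the upload handle, which is not shown in this thread, and the sample data is invented. It uses the strict :encoding(UTF-8) layer in place of :utf8; either layer decodes while reading, so a character split across internal buffer boundaries should not be a concern.

```perl
use strict;
use warnings;

# Stand-in for the upload handle (assumption): an in-memory file
# containing the raw UTF-8 bytes of "Äpfel\nBäume\n".
my $raw = "\xC3\x84pfel\nB\xC3\xA4ume\n";
open(my $fh, '<', \$raw) or die "cannot open in-memory handle: $!";

# Push a decoding layer onto the already opened handle.
binmode($fh, ':encoding(UTF-8)');

# Read line by line; every line arrives as decoded characters.
my @lines;
while (my $line = <$fh>) {
    chomp $line;
    push @lines, $line;
}
close $fh;

printf "%d lines, first line has %d characters\n",
    scalar @lines, length $lines[0];
```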

McA
