http://www.perlmonks.org?node_id=1051981


in reply to Problem with utf8 after nearly 4096 bytes

Hi everyone,

@McA

I had already wondered whether the problem was at the buffer boundaries. But if it were, only the words sitting on a 4k buffer boundary would be affected. Instead, every single accented letter is truncated, even one at byte 4k + (4k/2), which should be in the middle of the buffer.
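
To make the boundary argument concrete, here is a minimal sketch. It assumes each 4k chunk is decoded on its own with Encode::decode; the sample data and the chunking loop are made up for illustration and are not the real CGI code. In that scenario only the character that straddles the boundary should come out broken, while one that sits fully inside a chunk survives:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: 4095 ASCII bytes, then "é" (2 bytes in UTF-8) so that it
# straddles the 4096-byte boundary, then a second "é" fully inside chunk two.
my $e_acute = encode('UTF-8', "\x{e9}");
my $bytes   = ('a' x 4095) . $e_acute . ('b' x 10) . $e_acute;

my $chunk_size = 4096;

# Decode every chunk independently, as a naive buffered reader might.
my $naive = '';
for (my $pos = 0; $pos < length $bytes; $pos += $chunk_size) {
    my $chunk = substr($bytes, $pos, $chunk_size);
    $naive .= decode('UTF-8', $chunk);   # malformed input becomes U+FFFD
}

my $broken = () = $naive =~ /\x{FFFD}/g;  # pieces of the split character
my $intact = () = $naive =~ /\x{e9}/g;    # characters that survived

print "broken: $broken, intact: $intact\n";
```

So a pure boundary problem would not explain every accented letter being damaged.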

Calling binmode($fh, ":utf8"); on the filehandle before reading it into the array made the algorithm give no result at all. I'll have to run some tests before I can give a better explanation of what happened.
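
For reference, this is roughly what I mean by setting the layer before reading. It is only a sketch: an in-memory filehandle with made-up words stands in for the uploaded file, and I use ':encoding(UTF-8)' here, which validates the input, where my real code used ':utf8'.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Stand-in for the uploaded file: UTF-8 encoded bytes held in memory.
my $raw = encode('UTF-8', "caf\x{e9} ma\x{e7}\x{e3} p\x{e3}o\n");
open(my $fh, '<', \$raw) or die "Cannot open in-memory handle: $!";

# Ask perl to decode the bytes into characters as they are read.
binmode($fh, ':encoding(UTF-8)');

my @list;
while (my $line = <$fh>) {
    chomp $line;                      # drop the trailing newline
    push @list, split(/ /, $line);    # same split as in my real loop
}
close $fh;

# Each element is now a character string, so length() counts letters.
printf "%d words, first word has %d characters\n", scalar(@list), length $list[0];
```

With this stand-in data the words come back as characters rather than bytes, so the same call on the real upload handle needs more testing on my side.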

@Anonymous Monk

I've tested both, as Random Walk explained, and got the same result:

my @list;
while (my $l = <$fh>) {
    push(@list, split(/ /, $l));
}

I also checked the version of CGI.pm. It was 3.52. I updated it through cpan, using the command 'r CGI.pm', to version 3.63. Unfortunately, that didn't solve the problem. I checked POST_MAX too, and it was equal to -1. I think that means unlimited, right?

I have tried other files too, and they didn't work either if they were larger than 4k.
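
In case someone wants to compare, this is a small sketch of how those two values can be printed. It loads CGI.pm defensively, since the module has not shipped with core perl since 5.22; the $report variable is just a name I picked for the example. As far as I understand the CGI.pm documentation, a negative POST_MAX means no limit is applied.

```perl
use strict;
use warnings;

# CGI.pm may not be installed, so load it at run time and check the result.
my $have_cgi = eval { require CGI; 1 };

my $report;
if ($have_cgi) {
    no warnings 'once';    # $CGI::POST_MAX is only mentioned once in this file
    $report = sprintf "CGI.pm %s, POST_MAX = %s", CGI->VERSION, $CGI::POST_MAX;
}
else {
    $report = "CGI.pm is not installed";
}

print "$report\n";
```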

@Random Walk

Thanks for the explanation. I tried this and it didn't work :/ The code I tried is above.

@Another Anonymous Monk

I've thought about this, but I have no idea how I could do it. I know a few ways of reading files in Perl, but since I've been working with CGI I haven't found any other way to do this reading. Any tips?

Thank you all, guys,

Vieira.