http://www.perlmonks.org?node_id=1016658


in reply to minimalist perl-utf8 question

G'day, Didier,

I'm not sure what output you were expecting. I'll assume you expected both scripts to print the same value, either 5 or 10; if you expected something else, please clarify.

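The utf8 pragma only tells Perl that the source code itself is encoded in UTF-8 - the documentation is very clear about this. So, when your e-acute characters are part of the source, the results you're seeing are to be expected: without the pragma each e-acute is counted as its two UTF-8 bytes, and with it as a single character.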

For what it's worth, I'm using a Mac Pro and the same version of Perl as you:

$ perl -E 'say length q{ééééé}'
10
$ perl -E 'use utf8; say length q{ééééé}'
5

If the e-acute characters are external to the source code, use utf8; will have no effect:

$ perl -E 'say length $ARGV[0]' ééééé
10
$ perl -E 'use utf8; say length $ARGV[0]' ééééé
10
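If what you want is a character count for external data too, one option is to decode the incoming bytes yourself. Here's a minimal sketch, assuming the shell passes the argument as UTF-8 encoded bytes; char_length is just a name made up for this illustration, and the fallback string is only there so the script runs without an argument:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: count characters in a string of UTF-8 encoded bytes.
# Assumption: the input really is UTF-8 (e.g. from a UTF-8 terminal).
sub char_length {
    my ($bytes) = @_;
    my $chars = decode('UTF-8', $bytes);    # bytes -> Perl character string
    return length $chars;
}

# Fallback: five e-acutes written as raw UTF-8 bytes (0xC3 0xA9 each),
# so this source file stays pure ASCII and needs no "use utf8;".
my $arg = @ARGV ? $ARGV[0] : "\xC3\xA9" x 5;

printf "bytes: %d, characters: %d\n", length($arg), char_length($arg);
```

Perl's -C switch offers another route: with the A flag (perl -CA ...) the elements of @ARGV are expected to be UTF-8 encoded strings, so length should then report characters without an explicit decode.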

You might also like to take a look at the length function which also has some information regarding this issue.
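As a rough illustration of the distinction drawn there: length counts logical characters, and encoding the string first is one way to ask how many bytes it would occupy as UTF-8. In this sketch the sub name utf8_byte_length is hypothetical, and the e-acutes are written as escapes so the encoding of the source file doesn't come into it:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(encode);

# "\x{E9}" is e-acute (U+00E9); this string holds five characters.
my $chars = "\x{E9}" x 5;

# Hypothetical helper: how many bytes would the string need as UTF-8?
sub utf8_byte_length {
    my ($string) = @_;
    return length(encode('UTF-8', $string));
}

printf "characters:  %d\n", length($chars);
printf "UTF-8 bytes: %d\n", utf8_byte_length($chars);
```

This should print 5 for the character count and 10 for the UTF-8 byte count, matching the two results shown above.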

-- Ken

Re^2: minimalist perl-utf8 question
by Anonymous Monk on Feb 02, 2013 at 09:08 UTC
    Thanks for the explanations.

    I was expecting 5 for the length and 5 e-acutes for the string, whatever the encoding of the characters.