
[Already reported Perl Bug] Confusion over utf8 and in-memory filehandles

by ribasushi (Monk)
on Dec 28, 2012 at 00:14 UTC (#1010601=perlquestion)
ribasushi has asked for the wisdom of the Perl Monks concerning the following question:

UPDATE (solved fsvo)

It turns out this is a known bug, which surprisingly has not been fixed yet. In the meantime "do not do this if it hurts" seems to be the best course of action :(


Greetings venerable monks,

While I think (humbly) I have a rather good grasp of how unicode is handled by perl in and out, I find myself stumped by the following example:

perl -e '
  my $str = { map { $_ => "\x{A9}" } qw(byte char) };
  utf8::upgrade($str->{char});
  for (keys %$str) {
    open (my $fh, "<", \do{$str->{$_}});
    printf( "$_ is read as %s\n", unpack "H*", <$fh>);
  }
  printf "Strings are: %s\n", ($str->{byte} eq $str->{char} ? "equal" : "different");
'

I understand why "char" and "byte" are considered equal. What I do not understand is why the internal storage details "leak" through the in-memory filehandle.

Explanations welcome!


Replies are listed 'Best First'.
Re: Confusion over utf8 and in-memory filehandles
by Anonymous Monk on Dec 28, 2012 at 00:56 UTC

    What do you mean, how do they leak through?

    byte is read as a9
    char is read as c2a9
    Strings are: equal

    \xa9 is U+00A9 is ord 169; c2a9 is ord 169 UTF-8 encoded; after you decode it, it is chr 169
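The round trip described in that reply can be sketched with the core Encode module. This is only an illustration of the encode/decode relationship; the variable names are made up for the example and are not from the thread.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# One character, U+00A9 (ord 169).
my $char = "\x{A9}";

# Encoding to UTF-8 yields the two octets C2 A9.
my $octets = encode('UTF-8', $char);
printf "encoded octets: %s\n", unpack 'H*', $octets;

# Decoding those octets gives back a single character with ord 169.
my $chars = decode('UTF-8', $octets);
printf "decoded ord: %d, length: %d\n", ord($chars), length($chars);
```

Note that this shows explicit encoding and decoding; it says nothing about whether an in-memory filehandle should expose the internal representation.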

      It is the same string with the utf8 flag flipped on. As such, I would expect the same bytes to be available when reading it as a "filehandle". Yet the utf8-ness (which is claimed to be an internal implementation detail all over the docs) is "visible" in this case.

      I am not sure I understand why (nor can I find any relevant perldoc)
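One way to follow the "do not do this if it hurts" advice is to hand the in-memory filehandle a scalar that is guaranteed to be stored as bytes. The following is a minimal sketch of that idea, assuming the data contains no characters above 0xFF; the helper name read_hex_via_memfh is hypothetical and not part of the thread or of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: read the first line through an in-memory
# filehandle opened on a downgraded (byte-storage) copy of the string.
sub read_hex_via_memfh {
    my ($text) = @_;
    my $bytes = $text;          # work on a copy, leave the caller's scalar alone
    utf8::downgrade($bytes);    # dies if the string holds characters > 0xFF
    open(my $fh, '<', \$bytes)
        or die "Cannot open in-memory filehandle: $!";
    my $line = <$fh>;
    close $fh;
    return unpack 'H*', $line;
}

my %str = map { $_ => "\x{A9}" } qw(byte char);
utf8::upgrade($str{char});      # same string, different internal storage

for my $name (sort keys %str) {
    printf "%s is read as %s\n", $name, read_hex_via_memfh($str{$name});
}
```

With the downgraded copy both entries should read back as a9, regardless of how the original scalar was stored internally. For text that may contain wider characters, explicitly encoding to a chosen encoding before opening the filehandle would be the analogous step.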

Node Type: perlquestion [id://1010601]
Front-paged by Arunbear