in reply to Re: Jargon relating to Perl strings
in thread Jargon relating to Perl strings

now $x consists of a single byte.

Yes, that's what I call a byte. So maybe it's my definition, not my term that's unclear.

which is defined in terms of substr, for which the UTF8 flag *does* matter

Ah, there's the problem. "The UTF8 flag doesn't matter" means different things to us. For a given string, substr will always return the same value regardless of the UTF8 flag, so I say the UTF8 doesn't matter to substr.

my $flag_is_0 = "\xC9ric"; utf8::downgrade($flag_is_0); my $flag_is_1 = "\xC9ric"; utf8::upgrade($flag_is_1); say substr($flag_is_0, 0, 1) eq substr($flag_is_1, 0, 1) ?1:0; # 1

I shall endeavor to find something clearer.

just do mind that not all people share your definition

Thus this post. If I refer them to this post, they can understand what I say even if their definitions are different.

Replies are listed 'Best First'.
Re^3: Jargon relating to Perl strings
by JavaFan (Canon) on Jan 18, 2012 at 09:23 UTC
    Right, we do mean something different with "the UTF-8 doesn't matter". I interpret that as the only difference between the internal representation of the strings is whether the UTF-8 flag is set or not -- but you use it to mean "it doesn't matter whether the internal encoding is UTF-8 or not".
    use Devel::Peek; my $x = my $y = "\xC9ric"; utf8::upgrade($x); utf8::upgrade($y); utf8::encode($y); Dump($x); Dump($y); # Now $x and $y differ only in the setting of the UTF-8 flag say substr($x, 0, 1) eq substr($y, 0, 1) ? "equal" : "different"; __END__ SV = PV(0x8cd80cc) at 0x8cea9ec REFCNT = 1 FLAGS = (PADMY,POK,pPOK,UTF8) PV = 0x8cef6e8 "\303\211ric"\0 [UTF8 "\x{c9}ric"] CUR = 5 LEN = 9 SV = PV(0x8cd803c) at 0x8ceaa28 REFCNT = 1 FLAGS = (PADMY,POK,pPOK) PV = 0x8cf38b8 "\303\211ric"\0 CUR = 5 LEN = 9 different
      This has already been addressed in the OP, in case you didn't notice.