in reply to Re^9: UTF8 versus \w in pattern matching
in thread UTF8 versus \w in pattern matching

If you then add use Encode; and change the last line to print encode('UTF-8',Dumper($a)); (as you should when using a UTF-8 terminal), then you'll get $VAR1 = 'tón';

Assuming the real code is going to use more than one print statement, this suggestion requires calling encode() for every print, which is not DRY programming. An alternative: call binmode STDOUT, ':encoding(UTF-8)'; once, before any print statements, and then use normal print statements (like print Dumper($a);) throughout. This lets the I/O layer handle the translation from Perl's internal representation to UTF-8-encoded output.
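
A minimal sketch of the binmode approach, for illustration only. The helper name use_utf8_output and the sample string are my own assumptions, not from the original code, and an in-memory filehandle stands in for STDOUT so the resulting bytes can be inspected:

```perl
use strict;
use warnings;

# Hypothetical helper: push the UTF-8 encoding layer onto a handle once.
sub use_utf8_output {
    my ($fh) = @_;
    binmode($fh, ':encoding(UTF-8)') or die "binmode failed: $!";
    return $fh;
}

# Sample value (an assumption): the three characters t, o-acute, n.
my $word = "t\x{f3}n";

# In-memory handle standing in for STDOUT; in a real script this
# would be use_utf8_output(\*STDOUT) near the top of the program.
my $buffer = '';
open(my $out, '>', \$buffer) or die "open failed: $!";
use_utf8_output($out);

print {$out} "$word\n";          # plain print, no encode() call
print {$out} "again: $word\n";   # later prints are covered by the same layer
close($out);

# $buffer now holds the encoded bytes; U+00F3 became the two bytes C3 B3.
printf "%d bytes written\n", length $buffer;
```

The layer is set in one place, so adding more print statements later does not mean adding more encode() calls.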

Re^11: UTF8 versus \w in pattern matching
by ikegami (Pope) on Jul 06, 2021 at 21:01 UTC

    Too many moving parts!!! One should be using the following here:

    local $Data::Dumper::Useqq = 1; print(Dumper($a));

    Fix the problems until you get the correct string (one that contains "\x{e9}" or "\351" for "é"). Then worry about the output to the terminal.
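
    A rough sketch of why Useqq helps here, assuming a string that arrives as undecoded UTF-8 bytes (the variable names and sample data are illustrative, not from the original code). With Useqq set, the dump is plain ASCII, so the terminal's encoding cannot hide whether the string holds characters or raw bytes:

    ```perl
    use strict;
    use warnings;
    use Encode qw(decode);
    use Data::Dumper;

    local $Data::Dumper::Useqq = 1;   # escape non-ASCII, so the dump is pure ASCII
    local $Data::Dumper::Terse = 1;   # omit the "$VAR1 = " prefix

    # Assumed input: the UTF-8 encoded bytes of "ton" with o-acute, not yet decoded.
    my $bytes = "t\xC3\xB3n";
    # The same text after decoding: three characters, the middle one U+00F3.
    my $chars = decode('UTF-8', $bytes);

    my $dump_bytes = Dumper($bytes);   # shows two escaped bytes for the accented letter
    my $dump_chars = Dumper($chars);   # shows one escaped character for it

    print "undecoded (length ", length($bytes), "): $dump_bytes";
    print "decoded   (length ", length($chars), "): $dump_chars";
    ```

    If the dump of your own string looks like the undecoded case, the input still needs decoding, and that is the thing to fix before touching the output layer.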
