I was using encode() correctly. I found my mistake - and it was a strange one!
I was calling encode() from a recursive function that converts my entire data structure to iso-8859-15:
sub recode {
    my ($enc, $data) = @_;
    my $ref = ref($data);
    if ($ref) {
        if ($ref eq 'ARRAY') {
            # Array-Ref
            my @array = map { recode($enc, $_) } @$data;
            return \@array;
        } elsif ($ref eq 'SCALAR') {
            # Scalar-Ref: dereference first, otherwise this recurses forever
            my $scalar = recode($enc, $$data);
            return \$scalar;
        } elsif ($ref eq 'HASH') {
            # Hash-Ref
            my %hash = ();
            while (my ($key, $value) = each %$data) {
                $hash{recode($enc, $key)} = recode($enc, $value);
            }
            return \%hash;
        } else {
            # Object - XYZ::, ZYX::
            if ($ref =~ /^(XYZ|ZYX)::/) {
                my $object = bless({}, $ref);
                while (my ($key, $value) = each %$data) {
                    $object->{recode($enc, $key)} = recode($enc, $value);
                }
                return $object;
            } else {
                warn "recode(): $ref not supported";
                return $data;
            }
        }
    } else {
        # Be sure to assign to a variable here; otherwise
        # the UTF8 flag is not cleared
        $data = Encode::encode($enc, $data);
        return $data;
    }
}
The problem was in the last else block. I originally wrote "return Encode::encode($enc, $data);" there, without assigning the result to a variable, and that version did not clear the UTF8 flag! The flag was cleared only once I assigned the result to a variable before returning it.
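To see which parts of a converted structure still carry the flag, something like the following walker might help. It is only a sketch: flagged_leaves() is a hypothetical helper, not part of Encode, and the 'root' path label and the sample data are made up for illustration. It uses reftype() from Scalar::Util so that blessed hashes (like the XYZ:: objects) are walked as well.

```perl
use strict;
use warnings;
use Encode ();
use Scalar::Util qw(reftype);

# Hypothetical helper: returns a path for every string (or hash key)
# in $data that still has the UTF8 flag set.
sub flagged_leaves {
    my ($data, $path) = @_;
    $path = 'root' unless defined $path;
    my $type = reftype($data);    # undef for non-references; ignores bless
    my @found;
    if (!defined $type) {
        push @found, $path if defined $data && utf8::is_utf8($data);
    } elsif ($type eq 'ARRAY') {
        for my $i (0 .. $#$data) {
            push @found, flagged_leaves($data->[$i], sprintf('%s->[%d]', $path, $i));
        }
    } elsif ($type eq 'HASH') {
        for my $key (sort keys %$data) {
            my $where = sprintf('%s->{%s}', $path, $key);
            push @found, "key of $where" if utf8::is_utf8($key);
            push @found, flagged_leaves($data->{$key}, $where);
        }
    } elsif ($type eq 'SCALAR') {
        push @found, flagged_leaves($$data, sprintf('${%s}', $path));
    }
    # Other reference types (CODE, GLOB, ...) are skipped.
    return @found;
}

# Sample data (made up): "\xDCbel" is "Übel" in iso-8859-15.
my $chars = Encode::decode('iso-8859-15', "\xDCbel");    # character string
my $bytes = Encode::encode('iso-8859-15', $chars);       # octets again

my $data = {
    done    => [ $bytes, 'plain' ],
    pending => { name => $chars },
    boxed   => \$chars,
};

my @flagged = flagged_leaves($data);
print "still flagged: $_\n" for @flagged;
```

Running this on the return value of recode() should list nothing if every leaf was encoded; any path it prints points at a string that came back with the flag still on.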
I don't know why this happens. I tried to reproduce this behaviour to report a bug, but I failed:
#!/usr/bin/perl -w
use strict;
use Devel::Peek;
use Encode ();
my $str = 'Übel';
Dump($str);
$str = Encode::decode('iso-8859-15', $str);
Dump($str);
my $str2 = do_encode($str);
Dump($str2);
my $str3 = do_encode_with_tmp($str);
Dump($str3);
sub do_encode {
    my $text = shift;
    return Encode::encode('iso-8859-15', $text);
}

sub do_encode_with_tmp {
    my $text = shift;
    my $tmp = Encode::encode('iso-8859-15', $text);
    return $tmp;
}
Here, both subs clear the UTF8 flag correctly.
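For a check that does not depend on reading Dump() output by eye, the same comparison can be done with utf8::is_utf8(). This is a minimal sketch: encode_direct() and encode_via_tmp() are illustrative names for the two variants, and the input is written as "\xDCbel" so that the result does not depend on how the script file itself is encoded. The Encode documentation says the UTF8 flag is always off after encode(), so both variants should report 0 on a correctly working installation.

```perl
use strict;
use warnings;
use Encode ();

# Variant 1: return the result of encode() directly.
sub encode_direct {
    my ($enc, $text) = @_;
    return Encode::encode($enc, $text);
}

# Variant 2: assign to a temporary variable first.
sub encode_via_tmp {
    my ($enc, $text) = @_;
    my $tmp = Encode::encode($enc, $text);
    return $tmp;
}

# "\xDCbel" is "Übel" as iso-8859-15 octets; decode() turns it into
# a character string.
my $chars = Encode::decode('iso-8859-15', "\xDCbel");

my %flag = (
    input  => utf8::is_utf8($chars)                                ? 1 : 0,
    direct => utf8::is_utf8(encode_direct('iso-8859-15', $chars))  ? 1 : 0,
    tmp    => utf8::is_utf8(encode_via_tmp('iso-8859-15', $chars)) ? 1 : 0,
);

printf "%-6s UTF8 flag: %d\n", $_, $flag{$_} for qw(input direct tmp);
```

If "direct" and "tmp" ever differ, that would be the behaviour I saw inside recode(), and the Perl and Encode versions would be worth including in a bug report.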
Bye,
Uwe