Thanks for the reply and the explanation. The insertion works now.
A minor problem: I want to distinguish between French words and English words, and only do the decode/encode operation when a word is French. But the regex /%[0-9A-Fa-f]{2}/ doesn't catch them.
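if ( /%[0-9A-Fa-f]{2}/ ) {
    # 1. Unescape the %XX sequences into raw bytes.
    # my $escaped = uri_unescape( $_ ); same effect as the RE but slower
    s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    # 2. Decode the UTF-8 bytes into Perl characters.
    my $s = decode_utf8( $_ );
    # 3. Re-encode the characters as ISO-8859-1 bytes.
    $s = encode("iso-8859-1", $s);
    push @new_words, $s;
} else {
    # No %XX escapes: keep the word as-is.
    push @new_words, $_;
}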
In the case of 'énfasis', I peeked at both the URL submission and the data that arrives in the Perl program, and they are different: it is %C3%A9nfasis during submission, but it becomes énfasis after I grab the value through CGI.pm's param method. So the %XX escapes are already gone by the time the regex runs.
For now, I am taking out the if .. else part and doing encode/decode_utf8 on every word I receive, which doesn't feel like a good solution.
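One possible way to tell the two cases apart without the %XX regex is to test the already-unescaped bytes directly. This is only a rough sketch: it assumes each word arrives as a byte string (CGI.pm used without its -utf8 pragma), and the helper name utf8_word_to_latin1 is made up for illustration. Words with no high bytes are left alone, and words that are not valid UTF-8 are passed through unchanged.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not from any module): convert a word from UTF-8
# bytes to ISO-8859-1 bytes only when it really contains non-ASCII UTF-8.
sub utf8_word_to_latin1 {
    my ($bytes) = @_;

    # Pure ASCII (a typical English word): nothing to convert.
    return $bytes unless $bytes =~ /[\x80-\xFF]/;

    # Strict decode on a copy, because a CHECK argument such as FB_CROAK
    # may modify the source string. It dies on malformed UTF-8.
    my $copy  = $bytes;
    my $chars = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };

    # Not valid UTF-8 (e.g. already Latin-1): leave the word untouched.
    return $bytes unless defined $chars;

    # Characters outside Latin-1 become '?' with encode's default behaviour.
    return encode('iso-8859-1', $chars);
}

# Assumed sample input: the bytes CGI.pm would hand back for %C3%A9nfasis,
# plus a plain ASCII word.
my @words     = ("\xC3\xA9nfasis", "emphasis");
my @new_words = map { utf8_word_to_latin1($_) } @words;
```

The ASCII check is just a shortcut. Decoding and re-encoding a pure ASCII word gives back the same bytes, so running encode/decode_utf8 on every word is harmless for the English ones and only does unnecessary work.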