in reply to Re: question on encoding
in thread question on encoding
thanks for the reply and explanation. the insertion works now.
a minor problem: i want to distinguish between french words and english words, and only do the decode/encode operation when a word is french. but the regex /%[0-9A-Fa-f]{2}/ doesn't match the french ones.
in the case of 'énfasis' i peeked at the url being submitted and at the data that arrived at the perl program. they are different: it is %C3%A9nfasis during submission, but it becomes énfasis after i grab the value through CGI.pm's param method.

    if ( /%[0-9A-Fa-f]{2}/ ) {
        # 1.
        # my $escaped = uri_unescape( $_ ); same effect as the RE but slower
        s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
        # 2.
        my $s = decode_utf8( $_ );
        # 3.
        $s = encode("iso-8859-1", $s);
        push @new_words, $s;
    }
    else {
        push @new_words, $_;
    }
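a small sketch of why the regex never fires, under one assumption: that the value param hands back is the percent-decoded byte string (which matches what was observed above). the `$submitted` and `$received` names are made up for illustration, and the substitution only stands in for the unescaping that has already happened by the time the program sees the value. once the escapes are gone there is no `%` left to match, so a test for bytes outside the ASCII range may be a closer fit.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the value seen after param():
# the percent-escapes in the submitted string are already raw bytes.
my $submitted = '%C3%A9nfasis';
(my $received = $submitted) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;

# The original test looks for percent-escapes, which no longer exist.
my $has_escape = $received =~ /%[0-9A-Fa-f]{2}/ ? 1 : 0;

# Alternative test: is there any byte outside the 7-bit ASCII range?
my $has_high_byte = $received =~ /[^\x00-\x7F]/ ? 1 : 0;

printf "length=%d escape=%d high_byte=%d\n",
    length($received), $has_escape, $has_high_byte;
```

note that this separates ASCII-only words from words with accented characters, not french from english as such: a french word with no accents looks exactly like an english one here.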
for now, i have taken out the if .. else part and am running decode_utf8/encode on every word i receive, which doesn't feel like a good solution.
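running the conversion on every word may be less bad than it feels: ASCII bytes are identical in UTF-8 and ISO-8859-1, so pure ASCII words come out unchanged. the sketch below is one possible way to make the unconditional version a bit safer, not a definitive fix. `utf8_to_latin1` is a hypothetical helper name, and it assumes the incoming words are byte strings that are either valid UTF-8 or should be left alone.

```perl
use strict;
use warnings;
use Encode qw(decode encode FB_CROAK);

# Hypothetical helper: convert a byte string from UTF-8 to ISO-8859-1,
# returning it unchanged when it is not valid UTF-8.
sub utf8_to_latin1 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode with a CHECK value may modify its argument
    my $chars = eval { decode('UTF-8', $copy, FB_CROAK) };
    return $bytes unless defined $chars;    # malformed UTF-8: leave as-is
    return encode('iso-8859-1', $chars);
}

# Sample words (assumed input): one plain ASCII, one UTF-8 encoded.
my @words     = ("emphasis", "\xC3\xA9nfasis");
my @new_words = map { utf8_to_latin1($_) } @words;
```

with this input the first word passes through untouched and the second becomes the single-byte form "\xE9nfasis". characters that have no ISO-8859-1 equivalent would still be replaced by encode's default substitution character, the same as in the original snippet.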
Replies are listed 'Best First'.
Re^3: question on encoding
by graff (Chancellor) on Jan 24, 2007 at 23:40 UTC
Re^3: question on encoding
by Anonymous Monk on Jan 25, 2007 at 05:14 UTC