PerlMonks  

Re^3: question on encoding

by graff (Chancellor)
on Jan 24, 2007 at 18:40 UTC ( [id://596374]=note )


in reply to Re^2: question on encoding
in thread question on encoding

I want to distinguish between French words and English words, and only do the decode, encode operation when the word is French.

You want something like this, then:

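    # assumes "use Encode;" earlier in the script, and the current word in $_
    s/%([0-9a-f]{2})/chr(hex($1))/egi;   # turn %XX escapes back into raw bytes
    if ( /[\x80-\xff]/ ) {               # any byte with the 8th bit set?
        push @new_words, encode( "iso-8859-1", decode_utf8( $_ ));
    }
    else {
        push @new_words, $_;             # pure ASCII: no conversion needed
    }
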
The point there is that you only need to do the encoding conversion if the string happens to contain any bytes with the 8th bit set (i.e. bytes in the numeric range 128-255).
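
For anyone who wants to try this outside the original loop, here is a minimal self-contained sketch of the same idea. The sub name to_latin1_if_needed and the two sample inputs are made up for illustration; they are not from the thread, and the sketch assumes the escaped input really is UTF-8.

```perl
use strict;
use warnings;
use Encode qw(encode decode_utf8);

# Hypothetical helper (not from the original post): undo %XX escapes, then
# convert UTF-8 to ISO-8859-1 only when a high-bit byte is present.
sub to_latin1_if_needed {
    my ($word) = @_;
    $word =~ s/%([0-9a-f]{2})/chr(hex($1))/egi;
    return $word unless $word =~ /[\x80-\xff]/;   # pure ASCII: leave it alone
    return encode( "iso-8859-1", decode_utf8($word) );
}

# Sample inputs are assumptions: one plain ASCII word, one URL-escaped UTF-8 word.
my @new_words = map { to_latin1_if_needed($_) } ( "hello", "caf%C3%A9" );
```

With those inputs, "hello" passes through untouched, while "caf%C3%A9" ends up as the four bytes "caf\xe9", i.e. the two-byte UTF-8 sequence for "é" collapses to the single Latin-1 byte.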

Update: be aware that for this sort of approach, if the input data happen to contain any characters that are not in the iso-8859-1 table (e.g. certain "smart quote" characters, or Greek or Russian or ...), you'll get "?" instead of the intended characters as a result of the "encode(iso-8859-1)" call. That's just a limitation you have to live with if you have to stick with that old "legacy" iso-8859 encoding.
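
To see that limitation in action, here is a small sketch using a "smart quote" (U+2019) as the sample character. The FB_CROAK part is an addition that was not in the original suggestion: it is one way, using the Encode module's CHECK argument, to make the conversion fail loudly instead of silently producing "?".

```perl
use strict;
use warnings;
use Encode qw(encode decode_utf8 FB_CROAK);

# UTF-8 bytes for "it's" written with a right single quote (U+2019),
# a character that has no slot in the ISO-8859-1 table.
my $utf8_bytes = "it\xE2\x80\x99s";
my $chars      = decode_utf8($utf8_bytes);

# Default behaviour: the unmappable character is replaced with "?".
my $lossy = encode( "iso-8859-1", $chars );

# With FB_CROAK, encode() dies instead, so the caller can notice the loss.
# A copy is passed because encode() may modify its argument when CHECK is set.
my $copy   = $chars;
my $strict = eval { encode( "iso-8859-1", $copy, FB_CROAK ) };
my $failed = !defined $strict;
```

Here $lossy holds "it?s", and $failed is true because the strict call refuses to encode the quote character. Whether you want the quiet "?" or the hard failure depends on how much you care about spotting lost characters.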