PerlMonks  

Re: sed character codes

by ikegami (Patriarch)
on Mar 30, 2006 at 01:35 UTC ( [id://540090]=note )


in reply to sed character codes

If you wish to search for or replace a UTF-8 sequence, you'll need a string that holds UTF-8 encoded bytes. Encode is the module to use to convert a character string to UTF-8. Then you can match the bytes with a pattern such as /\xC0\xBF/.
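A minimal sketch of that approach, assuming the text starts out as a Perl character string. The sample string and the choice of U+00E9 (whose UTF-8 encoding is the two bytes C3 A9) are illustrative assumptions, not taken from the original question:

```perl
use strict;
use warnings;
use Encode qw(encode);

# A character string containing U+00E9 (e with acute accent).
my $chars = "caf\x{E9}";

# encode() returns a byte string; in UTF-8, U+00E9 becomes the bytes C3 A9.
my $bytes = encode('UTF-8', $chars);

# Search for the two-byte sequence using hex escapes.
print "matched\n" if $bytes =~ /\xC3\xA9/;

# Replace the same sequence, working on a copy of the byte string.
(my $replaced = $bytes) =~ s/\xC3\xA9/e/g;
print "$replaced\n";
```

The pattern is written one byte per \xHH escape, so a longer sequence simply needs more escapes.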

Of course, if the data was read in as raw bytes (that is, without any decoding applied), the string still holds the file's original bytes, so for UTF-8 input you should be able to use /\xC0\xBF/ already.

At least, that's how I understand things. I don't have much experience in this area.

Replies are listed 'Best First'.
Re^2: sed character codes
by kettle (Beadle) on Mar 30, 2006 at 01:54 UTC
    Thanks! That was exactly what I was looking for. I just needed the formatting convention, which appears to be:

    \x[0-9A-F]{2}\x[0-9A-F]{2}

    More generally, do you (or does anyone else) happen to know where I could find this information for other character encodings? joe
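    One possible way to get this information for other encodings is to let Encode produce the bytes and print them as \xHH escapes. This is only a sketch: hex_escapes is a hypothetical helper name, and the character and the three encodings are picked purely for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the \xHH escapes for a character
# as encoded in the named encoding.
sub hex_escapes {
    my ($char, $encoding) = @_;
    my $bytes = encode($encoding, $char);
    # unpack 'C*' yields one number per byte of the encoded string.
    return join '', map { sprintf('\\x%02X', $_) } unpack('C*', $bytes);
}

my $char = "\x{E9}";    # U+00E9, chosen for illustration
for my $encoding (qw(UTF-8 iso-8859-1 UTF-16BE)) {
    printf "%-10s %s\n", $encoding, hex_escapes($char, $encoding);
}
```

    Encode->encodings(":all") returns the encoding names available in a given installation.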
