I have a nasty-looking translation statement that "de-accents"
Latin-1 characters. I wrote it for indexing and searching web pages
in an accent-free way. The important thing to
remember is that you must de-accent both the search term
and the text being searched against. Also note that this
code only works for the Latin-1 character set.
Here's my tr statement (I'm deliberately not using the code
tag, because otherwise the line would be too long. I've also
inserted whitespace to help the wrapping; you should remove it
if you use this statement):
tr/\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD
\xCE\xCF\xD0\xD1\xD2\xD3\xD4\xD5\xD6\xD8\xD9\xDA\xDB\xDC
\xDD\xDF\xE0\xE1\xE2\xE3\xE4\xE5\xE6\xE7\xE8\xE9\xEA\xEB
\xEC\xED\xEE\xEF\xF1\xF2\xF3\xF4\xF5\xF6\xF8\xF9\xFA\xFB
\xFC\xFD\xFF/\x41\x41\x41\x41\x41\x41\x41\x43\x45\x45\x45
\x45\x49\x49\x49\x49\x44\x4E\x4F\x4F\x4F\x4F\x4F\x4F\x55
\x55\x55\x55\x59\x73\x61\x61\x61\x61\x61\x61\x61\x63\x65
\x65\x65\x65\x69\x69\x69\x69\x6E\x6F\x6F\x6F\x6F\x6F\x6F
\x75\x75\x75\x75\x79\x79/;
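
For anyone who wants to try this out, here is a minimal sketch of how the statement might be wrapped and used. The sub name `deaccent` and the sample strings are made up for illustration and are not part of the original snippet. The mapping should be equivalent to the tr above, just written with character ranges and split into an uppercase and a lowercase pass so the lines stay short. Like the original list, it leaves \xD7, \xF7 and the few letters the list omits (\xDE, \xF0, \xFE) untouched, and it assumes the input is Latin-1 text.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a de-accented copy of a Latin-1 string.
# Same character-for-character mapping as the long tr statement above,
# compressed with ranges (59 source characters, 59 replacements).
sub deaccent {
    my ($s) = @_;
    # Uppercase block plus sharp s: \xC0-\xD6, \xD8-\xDD, \xDF
    $s =~ tr/\xC0-\xD6\xD8-\xDD\xDF/AAAAAAACEEEEIIIIDNOOOOOOUUUUYs/;
    # Lowercase block: \xE0-\xEF, \xF1-\xF6, \xF8-\xFD, \xFF
    $s =~ tr/\xE0-\xEF\xF1-\xF6\xF8-\xFD\xFF/aaaaaaaceeeeiiiinoooooouuuuyy/;
    return $s;
}

# De-accent BOTH the text and the search term before comparing.
my $text = "Un caf\xE9 creme";     # accent only on "cafe"
my $term = "cafe cr\xE8me";        # accent only on "creme"

my $found = index(deaccent($text), deaccent($term)) >= 0;
print $found ? "match\n" : "no match\n";   # prints "match"
```

Without de-accenting both sides, the two sample strings would not match, since each carries an accent the other lacks.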