
Re: Playing with "funny" chars

by mischief (Hermit)
on Sep 27, 2004 at 12:57 UTC (#394148)

in reply to Playing with extended chars

You might also want to look at Text::Iconv.

Re^2: Playing with "funny" chars
by itub (Priest) on Sep 27, 2004 at 13:12 UTC
    (oops, I wanted to reply to the first post but clicked here by accident ;) ).

    My recommendation is to use perl 5.8.0 or more recent and look at perldoc Encode, perldoc open, and perldoc -f open. If tr doesn't work because you have the characters encoded in two bytes, you can do

    use Encode;   # exports decode_utf8 by default
    $s = decode_utf8($s);

    That will convert the string into the internal representation where characters are characters and you don't have to worry about how many bytes they need for encoding.
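A minimal sketch of that difference, assuming the input arrives as raw UTF-8 bytes. The bytes are written as hex escapes so the example does not depend on how the script file itself is saved, and the variable names are only illustrative:

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Raw UTF-8 bytes for "áé", as they might arrive from a file or a socket.
my $bytes = "\xC3\xA1\xC3\xA9";

# Decode into Perl's internal character representation.
my $chars = decode_utf8($bytes);

# length() counts bytes before decoding and characters after it.
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);

# tr/// now sees one character per accented letter
# (U+00E1 is "á", U+00E9 is "é").
(my $plain = $chars) =~ tr/\x{E1}\x{E9}/ae/;
print "$plain\n";
```

Run as is, this should report 4 bytes and 2 characters, then print "ae".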

      I think the problem is not in the string (I'm using perl 5.8.5, because 5.8.0 had some bugs on RedHat), but in the tr operator itself.

      The first attempt works like this:

      perl -e '$_="áéíóú";tr/áéíóú/aeiou/;print'
      aeaoauauau

      It seems that "á" is treated as two characters, and each one gets a different replacement character ("a" and "e").

      BTW, the values returned by the encode and decode functions make me think that the string is well formed, and that it is tr// that is doing the wrong thing. Am I completely lost?
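The output reported above is what you would expect if tr/// is working on bytes rather than characters. Here is a small sketch that reproduces it, with the UTF-8 bytes of "áéíóú" spelled out as hex escapes (an assumption about how the one-liner's source reached perl):

```perl
use strict;
use warnings;

# UTF-8 bytes of "áéíóú": each letter is a 0xC3 lead byte plus one more byte.
my $bytes = "\xC3\xA1\xC3\xA9\xC3\xAD\xC3\xB3\xC3\xBA";

# The search list is the same ten bytes. When a byte repeats in the search
# list only its first occurrence counts, and the last replacement character
# ("u") is reused for search-list positions beyond the fifth.
(my $out = $bytes) =~ tr/\xC3\xA1\xC3\xA9\xC3\xAD\xC3\xB3\xC3\xBA/aeiou/;

print "$out\n";
```

Every 0xC3 lead byte becomes "a", and the second byte of each pair gets whatever replacement lines up with its position in the search list, which gives "aeaoauauau".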

        If you have utf8 encoded strings in your program file, you need to use the utf8 pragma (see perldoc utf8).

        use utf8; $s = 'holáéíóúon'; $s =~ tr/áéíóú/aeiou/; print $s; # prints holaeiouon
        The code above may show the double characters explicitly, since this page is served as ISO-8859-1.
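A slightly fuller sketch along the same lines, assuming the script file is saved as UTF-8. The sample string and variable names are made up for illustration, and the binmode call is an addition that only matters when non-ASCII characters remain in the output:

```perl
use strict;
use warnings;
use utf8;   # string and tr/// literals in this file are UTF-8 text

# Encode characters as UTF-8 on output; without this the remaining "ñ"
# would be written as a single Latin-1 byte.
binmode(STDOUT, ':encoding(UTF-8)');

my $s = 'añoáé';
my $replaced = ($s =~ tr/áé/ae/);   # tr/// returns how many characters it translated

print "$s ($replaced replaced)\n";
```

Only "á" and "é" are in the search list, so the "ñ" passes through untouched and has to be encoded on the way out.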
