PerlMonks  

Case shifting on accented characters

by mwhiting (Beadle)
on Sep 12, 2013 at 19:15 UTC (#1053786)
mwhiting has asked for the wisdom of the Perl Monks concerning the following question:

How do I case shift accented characters? I want to take "LES MISÉRABLES" and change it to "LES MISéRABLES". This is so that I can do a regex comparison against that string. I don't need to shift the rest of the characters because I can do a case-insensitive comparison on the rest of it (//i), but that doesn't work on the accented characters.

I tried the lc function, but it just gives me "LES MISRABLES"

Replies are listed 'Best First'.
Re: Case shifting on accented characters
by ikegami (Pope) on Sep 12, 2013 at 20:50 UTC

    //i does work on accented characters... usually. When it doesn't, you can force it to, using one of the following methods:

    More likely, you don't actually have "é" or "É" in your string or in your code because you forgot to decode it, since you don't normally need the above.

        use utf8;  # Source file is encoded using UTF-8

        print "é" =~ /É/i ?1:0,"\n";  # 1
        print "É" =~ /é/i ?1:0,"\n";  # 1
        print "é" =~ /\w/ ?1:0,"\n";  # 1
        print "É" =~ /\w/ ?1:0,"\n";  # 1
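The "forgetting to decode" case mentioned above can be sketched as follows. This is only an illustration under assumptions: the sample string is spelled out as raw UTF-8 bytes to stand in for undecoded input, and the variable names ($bytes, $chars, $pattern) are made up for the example. It uses decode() from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes, as they might arrive from a file or socket that was
# read without a decoding layer. "É" is the two bytes C3 89 in UTF-8.
my $bytes = "LES MIS\xC3\x89RABLES";

# After decoding, "É" is a single character, U+00C9.
my $chars = decode('UTF-8', $bytes);

# \x{E9} is "é"; written as an escape so the example does not depend
# on the encoding of the source file.
my $pattern = qr/mis\x{E9}rables/i;

print "bytes: ", ($bytes =~ $pattern ? 1 : 0), "\n";  # 0
print "chars: ", ($chars =~ $pattern ? 1 : 0), "\n";  # 1
```

The undecoded string fails to match because the regex sees two unrelated bytes where it expects one accented character; once decoded, //i matches without any manual case shifting.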

    To answer your question, you could do that by lowercasing only the non-ASCII characters using s/([^\x00-\x7F])/lc($1)/eg; with one of the above in effect.

        use utf8;                              # UTF-8 code
        use open ':std', ':encoding(UTF-8)';   # UTF-8 terminal
        use 5.012;

        $_ = "LES MISÉRABLES";
        s/([^\x00-\x7F])/lc($1)/eg;            # LES MISéRABLES
        say;
Re: Case shifting on accented characters (casefold, fc)
by Anonymous Monk on Sep 13, 2013 at 08:43 UTC
