
Re: problems matching umlauts in env vars

by allolex (Curate)
on Jul 23, 2004 at 07:20 UTC (#376821)

in reply to problems matching umlauts in env vars

You need to set a locale whose alphabet contains these characters for \w to include them. This applies even with UTF-8: UTF-8 is just a standard way of encoding characters; it does not define which characters make up words in a particular language.

    use locale;
    use POSIX 'locale_h';

    # German locale, for example.
    # Run 'locale -a' to get the exact locale name.
    my $loc = 'de_DE.utf8';
    setlocale(LC_CTYPE, $loc) or die "Invalid locale $loc";

Either that, or use this little trick off of my home node: [A-Za-z-] instead of \w :)

I should probably add that the German locale will likely not match 'ë', since that letter does not exist in German. Maybe a Dutch or French locale would...
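To illustrate why a plain \w misses these letters, here is a minimal sketch. It uses a different technique from the locale approach above: it assumes the variable holds UTF-8 bytes and decodes them with the core Encode module, so that \w follows Unicode rules. DEMO_NAME is a hypothetical variable name, set inside the script only to keep the example self-contained.

```perl
use strict;
use warnings;
use Encode qw(decode);

# DEMO_NAME is a hypothetical variable, set here so the example runs on its own.
local $ENV{DEMO_NAME} = "Zo\xC3\xAB";    # "Zoë" as raw UTF-8 bytes

my $raw = $ENV{DEMO_NAME};

# Under Perl's default rules the bytes 0xC3 and 0xAB are not word characters.
my $raw_ok = ($raw =~ /^\w+$/) ? 1 : 0;

# Decode the bytes into characters; \w then applies Unicode rules.
my $text = decode('UTF-8', $raw, Encode::FB_CROAK() | Encode::LEAVE_SRC());
my $text_ok = ($text =~ /^\w+$/) ? 1 : 0;

print "raw: $raw_ok, decoded: $text_ok\n";
```

This only helps when the encoding of the incoming data is known. With a locale-based setup, the setlocale call shown earlier remains the relevant mechanism.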

Damon Allen Davison

Re^2: problems matching umlauts in env vars
by december (Pilgrim) on Aug 02, 2004 at 04:37 UTC

    Thanks for your reply. I have set the locale now, and that solves at least this problem.

    The German locale should be using the iso-8859-1 (or rather iso-8859-15) charset, which does contain an e with an umlaut. Standard French doesn't have umlauts, but Dutch (my native language) does. Either way, all Western European countries use the same charset, which should be iso-8859-15 (that's latin1 plus the euro sign).

    The problem now is that I don't know which charset will be given to me in the request... Could be pretty much anything.
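One common heuristic for input of unknown charset, sketched here under the assumption that only UTF-8 and iso-8859-15 need to be told apart: try a strict UTF-8 decode first, and fall back to iso-8859-15 if the bytes are not valid UTF-8. The helper name decode_guess is hypothetical, and the guess can be wrong, since some iso-8859-15 byte sequences also happen to be valid UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: guesses between UTF-8 and iso-8859-15 only.
sub decode_guess {
    my ($bytes) = @_;

    # A strict decode dies on malformed UTF-8; eval turns that into undef.
    my $text = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return $text if defined $text;

    # Not valid UTF-8: assume the Western European single-byte charset.
    return decode('iso-8859-15', $bytes);
}

my $from_utf8   = decode_guess("Zo\xC3\xAB");    # "Zoë" as UTF-8 bytes
my $from_latin9 = decode_guess("Zo\xEB");        # "Zoë" as iso-8859-15 bytes

print "same text\n" if $from_utf8 eq $from_latin9;
```

If the request states its charset explicitly, using that value is more reliable than guessing.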
