
Re: problems matching umlauts in env vars

by allolex (Curate)
on Jul 23, 2004 at 07:20 UTC (#376821)

in reply to problems matching umlauts in env vars

You need to set a locale whose character set contains the umlauted characters for \w to include them. You need to do this even for UTF-8. UTF-8 is just a standard way of representing characters; it does not define the set of characters that can make up words in a particular language.

use locale;
use POSIX 'locale_h';
# German locale, for example. Run 'locale -a' to get the exact locale name.
my $loc = 'de_DE.utf8';
setlocale(LC_CTYPE, $loc) or die "Invalid locale $loc";

Either that, or use this little trick off of my home node: an explicit character class that lists the accented letters alongside A-Za-z, instead of \w :)
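For what it's worth, here is a minimal sketch of that kind of explicit class. It is not the exact class from the home node; it assumes the input is a byte string in ISO-8859-1, and the names $LATIN1_WORD and is_latin1_word are made up for illustration. Unlike \w, it does not match digits or the underscore.

```perl
use strict;
use warnings;

# Assumption: the input is a byte string encoded as ISO-8859-1 (Latin-1).
# Bytes 0xC0-0xFF are accented letters there, except 0xD7 (multiplication
# sign) and 0xF7 (division sign), which are skipped.
my $LATIN1_WORD = qr/[A-Za-z\xC0-\xD6\xD8-\xF6\xF8-\xFF]/;

# Hypothetical helper: true if the whole string consists of Latin-1 letters.
sub is_latin1_word {
    my ($bytes) = @_;
    return $bytes =~ /\A$LATIN1_WORD+\z/ ? 1 : 0;
}

print is_latin1_word("Sch\xF6n") ? "word\n" : "not a word\n";
```

This needs no locale at all, but it only holds as long as the bytes really are Latin-1.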

I should probably add that the German locale will likely not match 'ë', since it does not exist in German. Maybe Dutch or French...

Damon Allen Davison

Replies are listed 'Best First'.
Re^2: problems matching umlauts in env vars
by december (Pilgrim) on Aug 02, 2004 at 04:37 UTC

    Thanks for your reply. I have set the locale now, and that solves at least this problem.

    The German locale should be using the ISO-8859-1 (or rather ISO-8859-15) charset, which does contain an e with an umlaut. Standard French doesn't have umlauts, but Dutch (my native language) does. Either way, all Western European countries use the same charset, which should be ISO-8859-15 (that's Latin-1 plus the euro sign).

    The problem now is that I don't know which charset will be given to me in the request... Could be pretty much anything.
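One possible way around the unknown charset, sketched here as an alternative to the locale approach and not something proposed in the thread: decode the raw bytes into a Perl character string first, then match. The sketch assumes the request is either UTF-8 or ISO-8859-15, which is a guess and not real charset detection; if the request declares its charset, that should win. The names decode_guess and is_word are hypothetical, and the /u regex flag needs Perl 5.14 or later.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: try strict UTF-8 first; if the bytes are not valid
# UTF-8, fall back to ISO-8859-15 (an assumption, not detection).
sub decode_guess {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps
    # $bytes unmodified so the fallback sees the original data.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return $text if defined $text;
    return decode('ISO-8859-15', $bytes);
}

# On a decoded string, /u makes \w follow Unicode rules, so no locale
# has to be set for accented letters to count as word characters.
sub is_word {
    my ($bytes) = @_;
    return decode_guess($bytes) =~ /\A\w+\z/u ? 1 : 0;
}

print is_word("K\xC3\xB6nig") ? "word\n" : "not a word\n";   # UTF-8 bytes
print is_word("K\xF6nig")     ? "word\n" : "not a word\n";   # Latin bytes
```

The fallback order matters: most ISO-8859-15 text with accented letters is not valid UTF-8, so trying UTF-8 first rarely misfires, but a short Latin byte sequence that happens to be valid UTF-8 would be decoded the wrong way.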
