http://www.perlmonks.org?node_id=209172


in reply to Unicode and regexes

The regexps themselves don't need any changes (I'm assuming Perl 5.8.0, since 5.6.x had some problems). You need to ensure two things:

  1. that your strings are correctly encoded
  2. that Perl knows it

The first is a problem in itself, but a bit off-topic.

The second can be done in two ways:

  1. if the strings come from a filehandle, you can use something like open(FH, "<:utf8", "file") to tell Perl to treat the data as UTF-8 (or use the :encoding layer; see perldoc -f open)
  2. otherwise (as in your example, where the strings come from a dirhandle), use Encode; and then $string = Encode::decode("utf-8", $string);
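
Here is a minimal sketch of both approaches, assuming Perl 5.8 or later. The scratch file and the sample string are made up for illustration; in the dirhandle case the bytes would come from readdir instead of a literal.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);

# Way 1: the data comes from a filehandle, so let the :utf8 layer decode it.
# The temp file is only a stand-in for a real UTF-8 encoded file.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;                        # write raw bytes, no layers
print {$out} "caf\xc3\xa9\n";        # "cafe" with e-acute, as UTF-8 bytes
close $out or die "close failed: $!";

open(my $fh, "<:utf8", $path) or die "cannot open $path: $!";
my $from_file = <$fh>;
close $fh;
chomp $from_file;

# Way 2: the bytes come from somewhere else (a dirhandle, for example),
# so decode them by hand.
my $raw     = "caf\xc3\xa9";         # still 5 bytes as far as Perl knows
my $decoded = decode("utf-8", $raw); # now a 4-character string

printf "%d %d %d\n", length($raw), length($from_file), length($decoded);
# prints "5 4 4"
```

Once Perl knows the string is character data, an unchanged regexp such as /^\w+$/ sees four characters rather than five bytes.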

Re: Re: Unicode and regexes
by hotshot (Prior) on Oct 31, 2002 at 07:54 UTC
    And what if I'm still using Perl 5.6.1?

    Hotshot