Re: Unicode and regexes

by dakkar (Hermit)
on Oct 30, 2002

in reply to Unicode and regexes

The regexps themselves don't need any change (I'm assuming Perl 5.8.0, since 5.6.x had some problems). You need to ensure two things:

  1. that your strings are correctly encoded
  2. that Perl knows it

The first is a problem in itself, but a bit off-topic.

The second can be done in two ways:

  1. if the strings come from a filehandle, you can use something like open(FH, "<:utf8", "file") to tell Perl to treat the data as UTF-8 (or use the :encoding layer; see perldoc -f open)
  2. otherwise (as in your example, where the strings come from a dirhandle), use Encode; and then $string = Encode::decode("utf-8", $string);
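
Here is a minimal sketch of both cases, assuming Perl 5.8.0 or later. To keep it self-contained, it reads from an in-memory filehandle rather than a real file, and a literal byte string stands in for a name returned by readdir; the sample strings and variable names are made up for illustration. The same regex is applied before and after decoding.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample data, written as raw UTF-8 bytes.
my $file_bytes = "na\xc3\xafve\n";    # "naive" with a diaeresis on the i
my $raw        = "caf\xc3\xa9.txt";   # stands in for a name from readdir

# Case 1: data from a filehandle. The :utf8 layer decodes while reading.
# (An in-memory filehandle is used here in place of a real file.)
open(my $in, "<:utf8", \$file_bytes) or die "open: $!";
my $line = <$in>;
close($in);
chomp $line;

# Case 2: data that arrives as plain bytes has to be decoded by hand.
my $decoded = decode("utf-8", $raw);

# The regexes need no changes; only the strings' status differs.
my $name_re = qr/^caf.\.txt$/;        # "." should match one character
my $raw_matches     = ($raw     =~ $name_re) ? 1 : 0;
my $decoded_matches = ($decoded =~ $name_re) ? 1 : 0;
my $line_matches    = ($line    =~ /^na.ve$/) ? 1 : 0;

printf "raw:     length %d, match %d\n", length($raw),     $raw_matches;
printf "decoded: length %d, match %d\n", length($decoded), $decoded_matches;
printf "line:    length %d, match %d\n", length($line),    $line_matches;
```

Before decoding, Perl sees the accented letter as two separate bytes, so the single "." cannot cover it and the match fails; after decoding, it is one character and the same regex matches.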

Re: Re: Unicode and regexes
on Oct 31, 2002
    And what if I still use Perl 5.6.1?


