The regexps, per se, don't need any change (I'm assuming Perl 5.8.0 or later, since Unicode handling in 5.6.x had some problems). You need to ensure two things:
- that your strings are correctly encoded
- that Perl knows it
The first is a problem in itself, but a bit off-topic.
The second can be done in two ways:
- if the strings come from a filehandle, you can use something like open(FH, "<:utf8", "file") to tell Perl to treat the data as UTF-8 (or use the :encoding layer; see perldoc -f open)
- otherwise (as in your example, where the strings come from a dirhandle), load the module with use Encode; and decode each string yourself: $string = Encode::decode("utf-8", $string);
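
To make the second case concrete, here is a minimal sketch. The byte string stands in for what readdir might hand back on a UTF-8 filesystem; the file name and the variable names are made up for illustration, not taken from your code. It shows the same regexp failing on the raw bytes and matching once the string has been decoded:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical raw bytes, as a dirhandle might return them:
# the UTF-8 encoding of a Polish file name (4 letters, 7 bytes) plus ".txt".
my $bytes = "\xC5\x82\xC3\xB3d\xC5\xBA.txt";

# Tell Perl these bytes are UTF-8; the result is a character string.
my $name = Encode::decode("utf-8", $bytes);

my $byte_len = length $bytes;   # counts bytes
my $char_len = length $name;    # counts characters

# Same regexp against both forms. On the raw bytes, \w sees the
# individual bytes of each multi-byte sequence, so the match fails.
my $matches_raw     = ($bytes =~ /^\w+\.txt$/) ? 1 : 0;
my $matches_decoded = ($name  =~ /^\w+\.txt$/) ? 1 : 0;

# Encode on the way out, otherwise Perl warns about wide characters.
binmode(STDOUT, ":encoding(UTF-8)");
print "bytes: $byte_len, chars: $char_len\n";
print "raw matches: $matches_raw, decoded matches: $matches_decoded\n";
print "name: $name\n";
```

In a real readdir loop you would apply the same Encode::decode call to each entry before matching, assuming the filesystem really does store names as UTF-8 (which is the first, off-topic problem).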