http://www.perlmonks.org?node_id=1021313


in reply to Re^3: RegEx: Detecting the certain cyrillic words
in thread RegEx: Detecting the certain cyrillic words

Yes, I saved the script as UTF-8. I'm using gedit 3.4.1, and its character encoding is set to "Current Locale UTF-8".
Enough codes make shapes. (Hamidjon)

Re^5: RegEx: Detecting the certain cyrillic words
by choroba (Cardinal) on Mar 01, 2013 at 18:22 UTC
    Weird. Works for me:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use utf8;
    use WWW::Mechanize qw();
    my $mech = WWW::Mechanize->new;
    $mech->credentials('user' => 'pass');
    $mech->get('http://kreml.ru/');
    my $content = $mech->text();
    $content =~ s/\n\r//g;
    binmode STDOUT, ':utf8';
    print $1,"\n" if $content =~ /(\bКремл&#1100;.*\"\b)(.*)/i;
    The regex actually looks like this (the code block above shows the ь as the HTML entity &#1100;):
    $content =~ /(\bКремль.*\"\b)(.*)/i
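    For anyone who wants to try the matching without a network connection, here is a minimal sketch. It is not the script above: the sample string and the find_kremlin helper are made up for illustration. It assumes the source file is saved as UTF-8, and it shows that a case-insensitive Cyrillic pattern matches decoded characters but not the raw UTF-8 bytes of the same text.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use utf8;                               # string literals below are UTF-8 in the source
use Encode qw(encode decode);

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: return the matched word, or undef if there is no match.
sub find_kremlin {
    my ($text) = @_;
    return $text =~ /\b(кремль)\b/i ? $1 : undef;
}

# Hypothetical sample standing in for a page body that arrives as raw UTF-8 bytes.
my $bytes = encode('UTF-8', 'Московский КРЕМЛЬ открыт');
my $chars = decode('UTF-8', $bytes);    # the same text decoded into Perl characters

# Try the same pattern against the byte string and the character string.
for my $case ([bytes => $bytes], [chars => $chars]) {
    my ($label, $text) = @$case;
    my $hit = find_kremlin($text);
    print "$label: ", defined $hit ? "matched $hit" : 'no match', "\n";
}
```

    Only the decoded string should match here. If the same pattern fails on real page content, checking whether that content has actually been decoded is one thing worth ruling out.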
    
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ