http://www.perlmonks.org?node_id=1021301


in reply to Re: RegEx: Detecting the certain cyrillic words
in thread RegEx: Detecting the certain cyrillic words

Here is my whole code, but it produces no result:

#!/usr/bin/perl -w
use utf8;
use strict;
use warnings FATAL => 'all';
use WWW::Mechanize qw();

my $mech = WWW::Mechanize->new;
$mech->credentials('user' => 'pass');
$mech->get('http://example.ru/');

my $content = $mech->text();
$content =~ s/\n\r//g;

print $1,"\n" if $content =~ /(\bЖелаю.*\!\b)(.*)/i;

Enough codes make shapes.

Re^3: RegEx: Detecting the certain cyrillic words
by choroba (Cardinal) on Mar 01, 2013 at 18:11 UTC
    Have you saved the script as utf-8?
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
      Yes, I saved the script as utf-8. I'm using gedit 3.4.1 and character encoding is "Current Locale UTF-8"
      Enough codes make shapes. (Hamidjon)
        Weird. Works for me:
        #!/usr/bin/perl
        use warnings;
        use strict;
        use utf8;
        use WWW::Mechanize qw();

        my $mech = WWW::Mechanize->new;
        $mech->credentials('user' => 'pass');
        $mech->get('http://kreml.ru/');

        my $content = $mech->text();
        $content =~ s/\n\r//g;

        binmode STDOUT, ':utf8';
        print $1,"\n" if $content =~ /(\bКремл&#1100;.*\"\b)(.*)/i;
        The regex actually looks like this (the code display above escaped the ь as an HTML entity):
        $content =~ /(\bКремль.*\"\b)(.*)/i
        
        لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
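
One possible explanation for the difference between the two scripts, as a minimal sketch rather than a confirmed diagnosis: in the original pattern the \b after \! can only match when a word character immediately follows the exclamation mark, which is rarely the case at the end of a sentence. An opening quotation mark, as in the Кремль test, is normally followed by a letter, so \"\b can match there. The sample string below is made up for illustration (the real text would come from $mech->text()), and the relaxed pattern is only one way to loosen the match. The binmode line mirrors the one in the working script so that Cyrillic output is encoded as UTF-8.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                               # string and regex literals in this file are UTF-8

binmode STDOUT, ':encoding(UTF-8)';     # encode wide characters when printing

# Hypothetical page text, standing in for the result of $mech->text().
my $content = 'Привет! Желаю удачи! До встречи.';

# Original pattern: the \b after "!" requires a word character right after the "!".
# Here every "!" is followed by a space, so the pattern cannot match.
my $strict = $content =~ /(\bЖелаю.*\!\b)(.*)/i ? $1 : undef;

# Relaxed pattern (an assumption about the intent): no trailing \b,
# and a lazy quantifier so the match stops at the first "!".
my $relaxed = $content =~ /(\bЖелаю.*?!)/i ? $1 : undef;

print defined $strict  ? "strict:  $strict\n"  : "strict:  no match\n";
print defined $relaxed ? "relaxed: $relaxed\n" : "relaxed: no match\n";
```

With this sample text the strict pattern reports no match, while the relaxed one captures "Желаю удачи!". If the real page happens to have a letter directly after the exclamation mark, the original pattern would match as well.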