Re: RegEx: Detecting the certain cyrillic words

by daxim (Chaplain)
on Mar 01, 2013 at 15:38 UTC


in reply to RegEx: Detecting the certain cyrillic words

#!/usr/bin/perl
use utf8;
use strict;
use warnings FATAL => 'all';
use WWW::Mechanize qw();

my $mech = WWW::Mechanize->new;
$mech->credentials('user' => 'password');
$mech->get('http://www.rambler.ru/');

my ($Кремль) = $mech->content =~ /(Кремль)/i;
You very likely want to use Web::Query to dissect your HTML instead of a regex, or at least to match against the text version of the document with the HTML stripped.
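
A minimal sketch of the second suggestion, matching against stripped text instead of raw HTML. To stay runnable without a network connection it uses a hard-coded sample instead of a live request, so the sample markup and the crude tag-stripping substitution are assumptions for illustration only. With a real page, $mech->text would supply the stripped text.

```perl
#!/usr/bin/perl
use utf8;                       # this source file contains UTF-8 literals
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for the raw bytes a server might send.
my $sample = "<p>Московский <b>КРЕМЛЬ</b> открыт.</p>";
my $bytes  = encode('UTF-8', $sample);

# Decode the bytes into characters before matching; otherwise the
# pattern is compared against individual UTF-8 bytes.
my $html = decode('UTF-8', $bytes);

# Crude tag stripping, good enough for this sample only.
(my $text = $html) =~ s/<[^>]*>//g;

# /i is case-insensitive for Cyrillic too once both sides are characters.
my ($word) = $text =~ /(Кремль)/i;

# Encode on output so the terminal receives UTF-8 bytes.
binmode STDOUT, ':encoding(UTF-8)';
print defined $word ? "matched: $word\n" : "no match\n";
```

The binmode line matters when printing: without an output encoding layer, Perl emits a "Wide character in print" warning for Cyrillic text.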


Re^2: RegEx: Detecting the certain cyrillic words
by programmer.perl (Beadle) on Mar 01, 2013 at 16:55 UTC
    I wrote the code as you showed, but the command line gave no result. Instead of my ($Кремль) = $mech->content =~ /(Кремль)/i; I wrote print $1,"\n" if $mech->content =~ /(Желаю.*)(\<.*)/i; The Cyrillic characters may not display correctly here.
    Enough codes make shapes.
Re^2: RegEx: Detecting the certain cyrillic words
by programmer.perl (Beadle) on Mar 01, 2013 at 17:10 UTC

    Here is my whole code, but it gives no result:

    #!/usr/bin/perl -w
    use utf8;
    use strict;
    use warnings FATAL => 'all';
    use WWW::Mechanize qw();

    my $mech = WWW::Mechanize->new;
    $mech->credentials('user' => 'pass');
    $mech->get('http://example.ru/');

    my $content = $mech->text();
    $content =~ s/\n\r//g;

    print $1,"\n" if $content =~ /(\bЖелаю.*\!\b)(.*)/i;

    Enough codes make shapes.
      Have you saved the script as utf-8?
      لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
        Yes, I saved the script as utf-8. I'm using gedit 3.4.1 and character encoding is "Current Locale UTF-8"
        Enough codes make shapes. (Hamidjon)
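
Without seeing the actual page this is only a guess, but two details in the pattern from the script above could explain the empty result. First, \b after "!" only matches when a word character follows immediately, which is rarely the case after an exclamation mark. Second, s/\n\r//g removes only a line feed directly followed by a carriage return, not either character on its own. The sample text below is hypothetical.

```perl
#!/usr/bin/perl
use utf8;
use strict;
use warnings;

# Hypothetical sample; the real page content is unknown.
my $content = "Желаю удачи! Всего хорошего.\r\n";

# A character class removes CR and LF in any order or combination.
(my $flat = $content) =~ s/[\r\n]+//g;

# "!" is followed by a space here, so \b cannot match after it.
my $with_boundary = $flat =~ /(\bЖелаю.*!\b)(.*)/i ? 1 : 0;

# Dropping the trailing \b and stopping at the first "!" does match.
my ($greeting) = $flat =~ /(\bЖелаю[^!]*!)/i;

binmode STDOUT, ':encoding(UTF-8)';
print "original pattern matched: $with_boundary\n";
print "relaxed pattern captured: $greeting\n" if defined $greeting;
```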
