PerlMonks  

Re^4: Problems searching and highlighting proximity words in a text

by jrc (Initiate)
on May 24, 2010 at 11:30 UTC (#841370)


in reply to Re^3: Problems searching and highlighting proximity words in a text
in thread Problems searching and highlighting proximity words in a text

Thanks for your solution. It seems to work for that example and for the others I have tried. The $4 does not seem to be necessary: at least in my case, the match returns only three results. Here is example code that works with your suggestions:

#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(locale_h);

my $old_locale = setlocale(LC_CTYPE);
setlocale(LC_CTYPE, 'ca_ES.iso885915@euro');
use locale;

my @expressions;
my @contents;

my $content = qq{ Abbott test1 test2 salud };
push (@contents, $content);
$content = qq{ salud test1 test2 Abbott };
push (@contents, $content);
$content = qq{ Abbott test1 test2 test2 test2 test2 test2 test2 test2 test2 salud };
push (@contents, $content);
$content = qq{ salud test1 test2 test2 test2 test2 test2 test2 test2 test2 Abbott };
push (@contents, $content);
$content = qq{ salud test1 test2 test2 test2 test2 test2 test2 test2 test2 test2 Abbott };
push (@contents, $content);
$content = qq{ Abbott test1 test2 test2 test2 test2 test2 test2 test2 test2 test2 salud };
push (@contents, $content);
$content = qq{ salud test1 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 Abbott };
push (@contents, $content);
$content = qq{ salud test1 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 Abbott };
push (@contents, $content);
$content = qq{ salud test1 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 test2 Abbott };
push (@contents, $content);
$content = qq{ salud Abbott test1 test2 salud };
push (@contents, $content);

my $par1 = '[a\xe0\xe1\xe4\xe2A\xc1\xc0\xc4\xc2]bb[o\xf2\xf3\xf6\xf4O\xd3\xd2\xd6\xd4]tt';
my $par2 = 's[a\xe0\xe1\xe4\xe2A\xc1\xc0\xc4\xc2]l[u\xf9\xfa\xfc\xfbU\xda\xd9\xdc\xdb]d';
my $expression = "$par1 $par2\:\:20";
push (@expressions, $expression);

warn "PART 1";
foreach my $cont (@contents){
    warn "CONTENT $cont";
    foreach my $exp (@expressions) {
        my $tag = 'span';
        my $class = "lighligth";
        next if ($exp !~ /::/);
        my ($exp, $distance) = split("::", $exp);
        my ($par1, $par2) = split(' ', $exp);
        # warn "Pars $par1 - $par2 - $distance";
        if ($cont =~ /$par1.*$par2/i) {
            if ($cont =~ /\b($par1)(\W+(?:\w+\W+){0,$distance})($par2)\b/i) {
            # if ($cont =~ m/\b($par1)(\W+(?:\w*\W*){1,$distance})?($par2)\b/i){
                my ($par1, $par2, $par3, $par4) = ($1, $2, $3, $4);
                warn "FIND 1 Par1: $par1 Par2: $par2 Part3: $par3";
                $cont =~ s/$par1\Q$par2\E$par3/<$tag$class> $par1<\/$tag>$par2<$tag$class> $par3<\/$tag>/gi;
            }
        }
        warn "STEP";
        if ($cont =~ /$par2.*$par1/i) {
            if ($cont =~ /\b($par2)(\W+(?:\w+\W+){0,$distance})($par1)\b/i) {
                my ($par1, $par2, $par3, $par4) = ($1, $2, $3, $4);
                warn "FIND 2 Par1: $par1 Par2: $par2 Part3: $par3";
                $cont =~ s/$par1\Q$par2\E$par3/<$tag$class> $par1<\/$tag>$par2<$tag$class> $par3<\/$tag>/gi;
            }
        }
    }
    warn "END";
    warn "\n\n";
}
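For anyone who wants a more compact variant, here is a minimal sketch of the same proximity idea folded into one substitution. It is not the code from this thread: the helper name highlight_near, its argument order, and the class="..." attribute in the output are my own assumptions, and it relies on the branch-reset group (?|...) from Perl 5.10 or later so that both word orders fill $1, $2 and $3. It also assumes the two terms contain no capture groups of their own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): wrap two terms in
# <span class="..."> when they lie within $distance words of each other,
# in either order, using a single pass over the text.
sub highlight_near {
    my ($text, $term1, $term2, $distance, $class) = @_;

    my $open  = qq{<span class="$class">};
    my $close = '</span>';

    # Gap between the terms: at least one non-word character, then up to
    # $distance intervening words (non-capturing, so no extra group).
    my $gap = qr/\W+(?:\w+\W+){0,$distance}/;

    # (?|...) is a branch-reset group: whichever alternative matches,
    # its three captures are numbered $1, $2 and $3.
    $text =~ s{\b(?|($term1)($gap)($term2)|($term2)($gap)($term1))\b}{$open$1$close$2$open$3$close}gi;

    return $text;
}

print highlight_near('Abbott test1 test2 salud', 'Abbott', 'salud', 20, 'highlight'), "\n";
print highlight_near('salud test1 test2 Abbott', 'Abbott', 'salud', 20, 'highlight'), "\n";
```

Doing both orders in one pass avoids running a second search over text that already contains the inserted tags, which a two-step approach has to be careful about.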


Re^5: Problems searching and highlighting proximity words in a text
by Krambambuli (Deacon) on May 24, 2010 at 12:05 UTC
    The $4 seems not to be necessary,

    Indeed, as long as you use

    (?:\w+\W+){0,$distance}

    instead of the expression I've used,

    (\w+\W+){0,$distance}

    there will be no extra capture group, and so no $4. I haven't done any benchmarking, but the non-capturing group (?:...) is probably a bit faster anyway.
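To make the difference concrete, here is a small sketch that runs both forms of the pattern against the first test string from the post and counts the captures. The literal terms Abbott and salud stand in for the character-class patterns, and the variable names are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text     = 'Abbott test1 test2 salud';
my $distance = 20;

# Capturing inner group: the repeated (\w+\W+) becomes group 3,
# which pushes the second search term into $4.
my @capturing = $text =~ /\b(Abbott)(\W+(\w+\W+){0,$distance})(salud)\b/i;

# Non-capturing inner group: only three groups exist, so there is no $4.
my @non_capturing = $text =~ /\b(Abbott)(\W+(?:\w+\W+){0,$distance})(salud)\b/i;

printf "capturing:     %d groups\n", scalar @capturing;
printf "non-capturing: %d groups\n", scalar @non_capturing;
```

With the capturing form, group 3 holds only the last repetition of the inner group, so it is rarely useful on its own.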
