PerlMonks  

I have a process that searches a text for two words ($par1, $par2) that occur within $distance words of each other, and highlights them in the text. I use the following regular expressions:
$content =~ /\b($par1)(\W+(?:\w+\W+){1,$distance})?($par2)\b/i;
$content =~ /\b($par2)(\W+(?:\w+\W+){1,$distance})?($par1)\b/i;
It is really the same expression with the words swapped. I don't remember why I wrote them separately, but in the program where I use them they have worked correctly. Today the program suddenly started to freeze, and I found that the cause is the second expression. The example on which perl could not finish evaluating the second regexp is this:
The words were "abbott" and "salud", and the maximum distance was 20.
The text was the following:
La mitad de las personas con VIH requiere de una atención psicológica y emocional derivada del impacto del diagnóstico o de las consecuencias de la propia infección, una cifra que dobla a la de la población general, según las conclusiones de las IV Jornadas de Divulgación sobre VIH que han reunido a unos doscientos profesionales, pacientes y estudiantes en el hospital Reina Sofía de Murcia. En el congreso, organizado por Amuvih en colaboración con el servicio de Proyección Social y Voluntariado de la Universidad de Murcia y Abbott, ha determinado que las personas que viven con VIH demandan especialmente atención a su salud mental para mejorar su calidad de vida, "una asignatura pendiente a pesar de los innumerables avances farmacológicos". Entre otros factores, las jornadas establecieron que en la situación de "vulnerabilidad" de las personas con VIH influyen "el propio diagnóstico, la comunicación de su situación a los allegados, el inicio del tratamiento, las fluctuaciones a lo largo de la infección, la pérdida de salud y deterioro físico, así como los efectos adversos del tratamiento". Igualmente, otros factores importantes son la pérdida de la motivación, el hastío, el estigma y rechazo, así como las nuevas parejas sexuales, los cambios familiares, laborales y sociales, entre otras cosas, que derivan en "riesgo de depresión mayor, trastorno distímico, trastorno por ansiedad generalizada o trastorno de pánico".
The first expression matches, but when the program processes the second expression it freezes. To handle accents, the exact words being compared are:
$par1 = '[a\xe0\xe1\xe4\xe2A\xc1\xc0\xc4\xc2]bb[o\xf2\xf3\xf6\xf4O\xd3\xd2\xd6\xd4]tt';
$par2 = 's[a\xe0\xe1\xe4\xe2A\xc1\xc0\xc4\xc2]l[u\xf9\xfa\xfc\xfbU\xda\xd9\xdc\xdb]d';
Note: to count correctly how many words lie between these two words, I have to change the locale:
use POSIX qw(locale_h);
my $old_locale = setlocale(LC_CTYPE);
setlocale(LC_CTYPE, 'ca_ES.iso885915@euro');
use locale;
Can anybody think of a reason why this happens, when in other cases the expression works correctly? Do you think the expression is wrong, or could it be simpler? The code evaluates the regular expression and, if it matches, highlights the content. This is the exact code:
my $new_content = $content;
if ($content =~ /\b($par1)(\W+(?:\w+\W+){1,$distance})?($par2)\b/i){
    my ($res1, $res2, $res3) = ($1, $2, $3);
    $new_content =~ s/$res1\Q$res2\E$res3/<$tag$class> $res1<\/$tag>$res2<$tag$class> $res3<\/$tag>/i;
}
if ($content =~ /\b($par2)(\W+(?:\w+\W+){1,$distance})?($par1)\b/i){
    my ($res1, $res2, $res3) = ($1, $2, $3);
    $new_content =~ s/$res1\Q$res2\E$res3/<$tag$class> $res1<\/$tag>$res2<$tag$class> $res3<\/$tag>/i;
}
I have no idea how to resolve it. For now I have had to disable the highlighting temporarily because of the freezing.
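One possible workaround, offered as a sketch and not as a diagnosis of the freeze: rather than asking a single regex to span the whole gap, the text can be split into words in one pass and the word positions compared. The helper names find_near_pair and highlight_near_pair, and the $open/$close tag arguments, are hypothetical and not part of the code above. The sketch also assumes the content is a decoded character string and relies on the unicode_strings feature (enabled by use v5.14) in place of use locale, which is a different setup from the one described in the post.

```perl
use v5.14;      # assumption: also enables unicode_strings, so \w follows Unicode rules
use strict;
use warnings;

# Hypothetical helper: returns two hash refs (leftmost word first) describing
# a pair of words that match $re_a and $re_b, in either order, with at most
# $distance words between them. Returns an empty list if there is no such pair.
sub find_near_pair {
    my ($content, $re_a, $re_b, $distance) = @_;
    my (@hits_a, @hits_b);
    my $index = 0;
    # Single pass over the words; each word is tested on its own, so no
    # pattern ever has to span the gap between the two search words.
    while ($content =~ /(\w+)/g) {
        my $word = $1;
        my $hit  = { index => $index++, start => $-[1], end => $+[1] };
        push @hits_a, $hit if $word =~ /\A(?:$re_a)\z/i;
        push @hits_b, $hit if $word =~ /\A(?:$re_b)\z/i;
    }
    for my $x (@hits_a) {
        for my $y (@hits_b) {
            my $between = abs($x->{index} - $y->{index}) - 1;
            next if $between < 0 || $between > $distance;
            return $x->{index} < $y->{index} ? ($x, $y) : ($y, $x);
        }
    }
    return;
}

# Hypothetical helper: wraps both words of the pair in $open/$close.
sub highlight_near_pair {
    my ($content, $re_a, $re_b, $distance, $open, $close) = @_;
    my ($left, $right) = find_near_pair($content, $re_a, $re_b, $distance)
        or return $content;
    # Insert from right to left so the earlier offsets stay valid.
    for my $hit ($right, $left) {
        substr($content, $hit->{end},   0, $close);
        substr($content, $hit->{start}, 0, $open);
    }
    return $content;
}

my $text = "Universidad de Murcia y Abbott, ha determinado que las personas "
         . "demandan especialmente atenci\x{f3}n a su salud mental";
my $out = highlight_near_pair($text, 'salud', 'abbott', 20, '<b>', '</b>');
print "$out\n";
```

Because the comparison works on word indexes, the order of the two words no longer needs two separate expressions. Note one behavioural difference: this sketch also accepts two directly adjacent words, whereas the gap in the original pattern, being either empty or at least one full word, appears not to match a pair separated only by whitespace.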

In reply to Problems searching and highlighting proximity words in a text by jrc
