PerlMonks  

Re^2: Count multiple pattern matches

by zejames (Hermit)
on Dec 07, 2004 at 10:18 UTC (#412873)


in reply to Re: Count multiple pattern matches
in thread Count multiple pattern matches

I've had a look at your pretty solution (with qr). However, it currently does not work, for several reasons.

The first is the one you mention: it does not compile, because of the combination of qr and the (?{...}) block. This is explained in perlre, in the section on the (?{...}) block:

For reasons of security, this construct is forbidden if the regular expression involves run-time interpolation of variables, unless the perilous use re 'eval' pragma has been used (see re), or the variables contain results of qr// operator (see perlop/"qr/STRING/imosx").

In short, this prevents an external variable that contains such a block from being executed by the regex engine. The solution is described in the doc: use re 'eval'.
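To make the restriction concrete, here is a minimal sketch (not taken from the original solution). It interpolates a plain string that contains a code block, once without and once with the pragma. The names $pattern, $ok_without, $ok_with and $main::count are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical counter, fully qualified so the code block can find it.
$main::count = 0;

# A plain string (not a qr// object) that happens to contain a code block.
my $pattern = 'foo(?{ $main::count++ })';

# Without "use re 'eval'", interpolating the string is refused at run time.
my $ok_without = eval { my $re = qr/$pattern/; 1 } ? 1 : 0;

# The pragma is lexically scoped: inside this block the same
# interpolation is permitted.
my $ok_with = do {
    use re 'eval';
    eval { my $re = qr/$pattern/; 1 } ? 1 : 0;
};

print "without re 'eval': ", ($ok_without ? "compiled" : "refused"), "\n";
print "with re 'eval':    ", ($ok_with    ? "compiled" : "refused"), "\n";
```

Since the pragma is lexical, it can be confined to the smallest block that really needs it.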

However, something else still does not work: when the regex qr/(?:(\Q$_\E))(?{$result{$_}++})/ matches, it does not set the expected %result hash entry, because at execution time $_ is not what you expect. If you look at the regex that was created:

    print $re;
    __DATA__
    (?-xism:(?-xism:(?:(foo))(?{$result{$_}++}))|(?-xism:(?:(bar))(?{$result{$_}++}))|(?-xism:(?:(12345))(?{$result{$_}++}))|(?-xism:(?:(abcd))(?{$result{$_}++})))

you'll see that the $_ variable inside the code block is not replaced by the keyword values. To my understanding, that is because (?{...}) contains code that will be run later, so perl does not interpolate it when the regex is built and leaves it unmodified.
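As a rough check of what $_ does hold at execution time: perlre says that inside a (?{...}) block, $_ refers to the string being matched. The sketch below uses a shortened keyword list and a made-up test string, so all the counts should end up under a single key, the target string itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our %result;
my @keywords = qw/foo bar/;    # shortened list, for illustration only

# $_ is interpolated inside \Q...\E when the qr// is built, but the $_
# inside (?{...}) is ordinary code that only runs during matching.
my @regexen = map { qr/(\Q$_\E)(?{ $result{$_}++ })/ } @keywords;

my $string = "foobarfoo";      # made-up test string
for my $re (@regexen) {
    my @found = $string =~ /$re/g;    # list context: find every match
}

# Every increment landed on the key holding the whole target string.
print "$_: $result{$_}\n" for sort keys %result;
```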

So, to do what we want here, I use (...) to capture the match at execution time and set the correct hash entry. There is one final problem: which variable holds the captured text.

    my $data = "foofdsfdsbar";
    my @matches = $data =~ m/(foo)(?{print "($1,$2,$+)"})|(bar)(?{print "($1,$2,$+)"})/g;
    __DATA__
    (foo,,foo)(,bar,bar)

That is to say: $+ is the only variable that reliably contains the text of the last group that matched, whatever its number (BTW, I know about the performance penalty when using it, but I won't care for now ;)
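The same behaviour of $+ can be seen without any code block. This is only a sketch, using a plain while loop over a /g match in place of (?{...}); the @seen array is a made-up name for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $data = "foofdsfdsbar";
my @seen;    # one [ $1, $2, $+ ] triple per successful match

while ($data =~ /(foo)|(bar)/g) {
    # Which of $1 / $2 is defined depends on the alternative that matched;
    # $+ holds the text of the highest-numbered group that did match.
    push @seen, [ defined $1 ? $1 : '', defined $2 ? $2 : '', $+ ];
}

print "(", join(",", @$_), ")" for @seen;
print "\n";
```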

So, let's code it !

    #!/usr/bin/perl
    use strict;
    use re 'eval';

    my @keywords = qw/foo bar 12345 abcd/;
    my ($string) = "foobarfoo1234523423412345abcdefsadfabc";
    our %result;

    my @regexen = map { qr/(?:(\Q$_\E))(?{$result{$+}++})/ } @keywords;
    my $re = do { local $" = '|'; qr/@regexen/; };

    my @match = $string =~ /$re/g;
    print "$_: $result{$_}\n" for @keywords;

--
zejames

