PerlMonks  

One way is almost what you have, with a few changes.

use strict;
my @keywords = qw/foo bar 12345 abcd/;
my (%keyword, %result);
my $string = "foobarfoo1234523423412345abcdefsadfabc";

Precompile the regexen:

@keyword{@keywords} = map { qr/\Q$_\E/ } @keywords;

Now get the count directly, without any named temporary. The /g flag is what makes the match return every occurrence rather than just the first:

$result{$_} = () = $string =~ /$keyword{$_}/g for keys %keyword;

That's no big change over what you have, but it uses some idiomatic optimizations.
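
For anyone unfamiliar with the "= () =" idiom, here is a small standalone sketch of what it does. The sample text and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "abcabcabc";

# A list assignment evaluated in scalar context yields the number of
# elements on its right-hand side, so "= () =" counts what the match returns.
my $with_g    = () = $text =~ /abc/g;   # every non-overlapping match
my $without_g = () = $text =~ /abc/;    # a successful match without captures returns (1)

print "with /g: $with_g, without /g: $without_g\n";
```

Without /g the count can only ever be 0 or 1, which is why the flag matters when counting.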

Another way is to count within the big regex you mention. You can do that with a code construct in the regex:

my @regexen = map { qr/(?:\Q$_\E(?{$result{$_}++}))/ } @keywords;
my $re = do { local $" = '|'; qr/@regexen/ };

I've used my favorite tricky way of turning an array into an alternation there (qr// is an interpolating quote operator, so with $" set to '|' the array elements are joined with |).

1 while $string =~ /$re/g;
print "$_: $result{$_}\n" for @keywords;

There, the regex engine should only evaluate the code part if the text has matched, and then restart the regex at pos for the next match. Untested.

Update: Perl's qr// doesn't seem to like running the (?{$result{$_}++}) bit. I'm not sure why. Anybody know?
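
Two likely culprits, offered as a guess rather than a diagnosis: inside a (?{...}) block, $_ is the string being matched, not the keyword that map was looping over, so the counter would be filed under the wrong key; and older perls refuse a code block in a pattern that also has run-time interpolation unless use re 'eval' is in effect. A minimal sketch that sidesteps both by dropping the code block altogether: capture the alternation and count in an ordinary loop. This is a swapped-in technique, not the (?{...}) construct, and the name $alternation is mine.

```perl
use strict;
use warnings;

my @keywords = qw/foo bar 12345 abcd/;
my $string   = "foobarfoo1234523423412345abcdefsadfabc";

# One alternation of the quoted keywords, wrapped in a capture group.
my $alternation = join '|', map { quotemeta } @keywords;
my $re          = qr/($alternation)/;

# Start every keyword at zero so words that never match are still reported.
my %result = map { $_ => 0 } @keywords;

# In scalar context /g resumes at pos() after each match; $1 holds the keyword found.
$result{$1}++ while $string =~ /$re/g;

print "$_: $result{$_}\n" for @keywords;
```

Note that with an alternation the leftmost alternative that matches wins at each position, so if one keyword is a prefix of another, list the longer one first.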

A third way is to munch through the string with index for each word you want to match.
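
A rough sketch of the index approach, assuming you want non-overlapping counts like a regex with /g would give. The sub name count_with_index is hypothetical.

```perl
use strict;
use warnings;

# Count non-overlapping occurrences of each keyword using index().
sub count_with_index {
    my ($string, @keywords) = @_;
    my %count;
    for my $word (@keywords) {
        $count{$word} = 0;
        next unless length $word;    # an empty keyword would never advance
        my $pos = 0;
        # index returns -1 when there is no further occurrence.
        while (($pos = index($string, $word, $pos)) >= 0) {
            $count{$word}++;
            $pos += length $word;    # step past this hit before searching again
        }
    }
    return %count;
}

my %count = count_with_index(
    "foobarfoo1234523423412345abcdefsadfabc",
    qw/foo bar 12345 abcd/,
);
print "$_: $count{$_}\n" for sort keys %count;
```

To count overlapping occurrences instead, advance $pos by 1 rather than by the keyword length.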

It may be worthwhile to study your text before running the regex matches on it. Benchmark your different approaches; chances are that each will be best for some cases.
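
One way to set up such a comparison with the core Benchmark module, as a sketch only. The sub names and the repeated sample string are my own choices, and the numbers will depend on your data and your perl.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @keywords = qw/foo bar 12345 abcd/;
my $string   = "foobarfoo1234523423412345abcdefsadfabc" x 100;
my %regex    = map { $_ => qr/\Q$_\E/ } @keywords;

# Total matches using one precompiled regex per keyword.
sub total_by_regex {
    my $total = 0;
    for my $word (@keywords) {
        my $n = () = $string =~ /$regex{$word}/g;
        $total += $n;
    }
    return $total;
}

# Total matches using index, stepping past each hit.
sub total_by_index {
    my $total = 0;
    for my $word (@keywords) {
        my $pos = 0;
        while (($pos = index($string, $word, $pos)) >= 0) {
            $total++;
            $pos += length $word;
        }
    }
    return $total;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    regex => \&total_by_regex,
    index => \&total_by_index,
});
```

Check that the candidates return the same totals before trusting the timings, and rerun with text that looks like your real input.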

After Compline,
Zaxo


In reply to Re: Count multiple pattern matches by Zaxo
in thread Count multiple pattern matches by johnnywang
