
Counting multiple matches of RegExp in single line

by New Novice (Sexton)
on Nov 27, 2004 at 11:57 UTC (#410702=perlquestion)
New Novice has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Enlightened Ones,

I am looking for a way to count the occurrences of a match when the search term appears several times in one line. I take it that storing the strings in an array and running them through a for loop is only helpful when the regexp shows up once per line. I know that this does not hold in my case.

Presumably there is some internal variable of the regexp engine that stores the rest of the string following the (first) match. This could then be used like this:

    #! C:\Programme\Perl\bin\
    use warnings;
    use strict;

    our $count = 0;
    our $teststring = "Long test string \n that includes the word we search, which is test (to test it)\n several times in one line like this: test test test";
    our $residual = $teststring;

    while ($residual) {
        if ($residual =~ "test") {
            $residual = $_;
            $count++;
        }
    }
    print $count;

to count the matches, including several within one line. Of course, $_ isn't the variable I am looking for.

Thanks in advance for any hints.

Replies are listed 'Best First'.
Re: Counting multiple matches of RegExp in single line
by gaal (Parson) on Nov 27, 2004 at 12:08 UTC
    Here's one way to do it:

    $count += () = $teststring =~ /(test)/g;

    This syntax may be a little cryptic at first, so here's the breakdown:

    You can assign the results of a match to an array, like so: @matches = $teststring =~ /(test)/g;

    Note the use of a capturing match (we use (test) in parentheses), and the /g modifier to match more than once.

    Now if you have @matches, then using it in scalar context gives you how many matches there were, so this would work:

    $count += @matches;
    The method I'm suggesting simply does this without the temporary variable. You do need to force the match to take place in list context, otherwise it'll just return 1 on successful matches; that's what the () = bit achieves.
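
    To make the context difference concrete, here is a minimal sketch. The sample string and the variable names $copy and $scalar_result are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample text, not the string from the question.
my $teststring = "test one test two\ntest three";

# Scalar context: /g only reports whether a (further) match was found.
# A separate copy is used so that pos() of $teststring stays untouched.
my $copy          = $teststring;
my $scalar_result = $copy =~ /test/g;

# List context forced by the empty list. The list assignment itself,
# evaluated in scalar context, yields the number of elements on its
# right-hand side, which here is the number of matches.
my $count = () = $teststring =~ /test/g;

print "scalar context: $scalar_result, list-context count: $count\n";
```

    With this sample the scalar-context result is simply a true value, while the list-context form counts all three occurrences.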

Re: Counting multiple matches of RegExp in single line
by gaal (Parson) on Nov 27, 2004 at 12:18 UTC
    And here's another way to do it (probably clearer):

    $count++ while $teststring =~ /test/g;

    Note that there's no need for parens this time.
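
    The loop works because a /g match in scalar context resumes where the previous one stopped, and pos() exposes that position. A small sketch, assuming a made-up sample string and an @offsets array that are not part of the original post:

```perl
use strict;
use warnings;

# Hypothetical sample text, not the string from the question.
my $teststring = "test one test two\ntest three";

my $count = 0;
my @offsets;    # start offset of each match, for illustration only

# Each iteration finds the next match after the previous one.
# When no further match exists, the condition is false and pos() resets.
while ($teststring =~ /test/g) {
    $count++;
    # pos() is the offset just past the match; subtract the match length.
    push @offsets, pos($teststring) - length('test');
}

print "count=$count offsets=@offsets\n";
```

    This is also roughly what the original question was after: Perl tracks where the last match ended, so there is no need to keep a residual string by hand.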

      If you mean the parens in the regex, you don't need them the other way either. /PAT/g in list context implies parens around the whole match if none are provided in the pattern.
      $count = () = $teststring =~ /test/g;
      will work just fine.

      This is documented in perlop, see the last sentence in this paragraph:

      The "/g" modifier specifies global pattern matching--that is, matching as many times as possible within the string. How it behaves depends on the context. In list context, it returns a list of the substrings matched by any capturing parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern.
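
      One consequence of that paragraph is worth a quick sketch: the list-context idiom counts returned substrings, so a pattern with more than one capture group inflates the count. The sample text and variable names below are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text with three letter-digit pairs.
my $text = "a1 b2 c3";

# No parens: one element per match.
my $no_parens = () = $text =~ /[a-z]\d/g;

# One capture group: still one element per match.
my $one_group = () = $text =~ /([a-z])\d/g;

# Two capture groups: two elements per match, so the count doubles.
my $two_groups = () = $text =~ /([a-z])(\d)/g;

print "no parens: $no_parens, one group: $one_group, two groups: $two_groups\n";
```

      If the pattern needs several groups, the while loop with $count++ still counts matches rather than captures.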
        Hey, good stuff, thanks for the tip.

Node Type: perlquestion [id://410702]
Approved by Arunbear