Re: Limiting number of regex matches

by davido (Cardinal)
on Sep 25, 2012 at 21:17 UTC


in reply to Limiting number of regex matches

$count++ while $count < $limit && $str =~ m/\bdog\b/g;
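As a minimal sketch of how the one-liner above might be packaged for reuse: the helper name `count_up_to` and the sample strings are illustrative assumptions, not part of the original post. Because the sub works on its own copy of the string, each call starts matching from the beginning.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the thread):
# count matches of $re in $str, stopping once $limit matches are seen.
sub count_up_to {
    my ($str, $re, $limit) = @_;    # $str is a fresh copy, so pos() starts unset
    my $count = 0;                  # explicit 0 avoids an "uninitialized" warning
    $count++ while $count < $limit && $str =~ /$re/g;
    return $count;
}

print count_up_to('dog dog horse dog cow dog pig dog', qr/\bdog\b/, 3), "\n";  # 3
print count_up_to('horse dog cow',                     qr/\bdog\b/, 3), "\n";  # 1
print count_up_to('hotdog dogma',                      qr/\bdog\b/, 3), "\n";  # 0
```

The last call returns 0 because `\b` requires a word boundary on both sides of `dog`.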

Dave

Replies are listed 'Best First'.
Re^2: Limiting number of regex matches
by Marshall (Canon) on Sep 25, 2012 at 22:07 UTC
    I do like Dave's answer. But there is one thing he didn't tell us: $count has to be initialized before this statement runs when using strict and warnings. $limit could just as well have been a constant, and I think it would behave the same.

        #!/usr/bin/perl -w
        use strict;

        my $str = 'dog dog horse dog cow dog pig dog';
        my $count = 0;  # won't work unless $count initialized to 0
        # simple declaration of "my $count;" or a "my"
        # variable within this type of complex perl
        # statement doesn't work under strict and warnings.
        # Or at least I've not been able to get this
        # kind of stuff to work before.
        # But given my caveats, this does work!
        # whether it is better or not is left to
        # the readers..

        my $limit = 3;
        $count++ while $count < $limit && $str =~ m/dog/g;
        print "$count dogs were counted\n";

        my $str2 = "horse dog cow";
        $count = 0;  # needed for initialization
        $count++ while $count < $limit && $str2 =~ m/dog/g;
        print "$count dogs were counted\n";
        __END__
        3 dogs were counted
        1 dogs were counted

    Now one good thing about Dave's code is that it will stop after a certain number of matches, or at least I think that is what is going to happen (the position of a /g match can be monitored and adjusted as matching progresses). Whether or not that really matters depends upon the length of the string and the number of "dogs". I would claim that with fewer than 20 animals, it doesn't matter at all. Once the string gets much bigger than that, well, it could matter. Software is art with science as a base, but there are exceptions to every "rule". What is best in this application is just not known. That's why I show some examples of how to use Dave's code. It's a good idea and worthy of consideration, especially if the data set is very large.

      To be accurate, it does work with or without initializing $count to zero. But on the first iteration, the undefined $count will be treated as though it were zero, and a warning will be generated letting the programmer know that he probably should initialize $count to zero explicitly before using it in the context of a numeric comparison (assuming warnings are enabled, as they probably ought to be).
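A small sketch of the behaviour described above, assuming warnings are enabled. The `@warnings` array and the `$SIG{__WARN__}` handler are illustrative additions used only to capture the warnings so they can be counted instead of going to STDERR.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them (illustrative only).
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $limit = 3;

# Case 1: the string contains matches. $count starts out undefined.
my $str = 'dog dog horse dog cow dog pig dog';
my $count;
$count++ while $count < $limit && $str =~ m/\bdog\b/g;
print "count: $count\n";
print "warnings so far: ", scalar(@warnings), "\n";

# Case 2: no matches at all. The comparison still warns once,
# and $none is never incremented, so it stays undefined.
my $str2 = 'horse cow';
my $none;
$none++ while $none < $limit && $str2 =~ m/\bdog\b/g;
print "none is ", (defined $none ? "defined" : "still undefined"), "\n";
print "warnings in total: ", scalar(@warnings), "\n";
```

The second case is one practical reason to initialize the counter: with no matches, an uninitialized counter is left undefined rather than 0, so printing it later would warn again.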

      As for why the example doesn't explicitly use strict, first it seemed that the OP already had a handle on how to declare lexical variables, and second, "Well, because it's a four-line one-line example program I concocted as an example in my Usenet PerlMonks article---duh!"

      (I hope the intended humor isn't lost in this post; your point is valid.)

      Oh, and you're correct; the process stops as soon as the $limit-th match occurs, which is a good approach since it stops extra work from happening. Think of it as the difference between List::Util's first function and the core's grep.
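One way to observe the early stop, as a rough sketch: after the limited loop, `pos($str)` reports where matching left off, which can be compared with the string length. The variable names `$stopped_at` and `$total` are illustrative assumptions. The unlimited count is included for contrast, in the same way that grep examines every element while first returns at the first hit.

```perl
use strict;
use warnings;

my $str   = 'dog dog horse dog cow dog pig dog';
my $limit = 3;

# Limited count: && short-circuits once $count reaches $limit,
# so the regex is not attempted again after the third match.
my $count = 0;
$count++ while $count < $limit && $str =~ m/\bdog\b/g;
my $stopped_at = pos($str);    # offset just past the last match found

# Unlimited count for comparison: list-context //g scans to the end.
pos($str) = undef;             # reset the match position before reusing $str
my $total = () = $str =~ m/\bdog\b/g;

print "limited: $count (stopped at offset $stopped_at of ", length($str), ")\n";
print "total:   $total\n";
```

Note the `pos($str) = undef` line: a scalar-context `//g` match remembers its position on the variable, so reusing the same string without resetting it would continue from where the limited loop stopped.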


      Dave

        I would personally consider a "warning" as an error even if it is the first one.

        I don't see any really significant disagreement here. I ran your code and it works. And you ran my code and it works.

        Ok, there are different assumptions as to data set size, etc. But at the end of the day, the OP got two great ideas and he/she can try them both and see what happens in the particular application.

        I claim mutual success! We both did a good job.
