http://www.perlmonks.org?node_id=920224


in reply to Re: Using variables in regex search
in thread Using variables in regex search

Thanks, Tanktalus!

I'm looking for a literal string. But I would like to find all occurrences of it, not just the first one (index gives only the first match).

A regex search could iterate over all matches. But I think I have to escape the contents of $str, because it's not a regex expression itself...

Re^3: Using variables in regex search
by Tanktalus (Canon) on Aug 14, 2011 at 14:00 UTC

    Now it's sounding like an XY Problem. What do you need to do with all found occurrences? Modify them? With a literal string, I can only think of three things to do with it: check if it's there (boolean - a single match is sufficient), modify it (s/../../), or count it. And the vast majority IME is the first one. Only the modification one "needs" a regex, and even that isn't really true.

    That all said, index can find multiple matches as well:

    $ perl -lE '
        my ($haystack, $needle) = @ARGV;
        my $i = 0;
        my @found;
        while (-1 != (my $curidx = index $haystack, $needle, $i)) {
            push @found, $curidx;
            $i = $curidx + 1;
        }
        say "found at $_" for @found;
      ' abcsdfabcasegabc abc
    found at 0
    found at 6
    found at 13
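One detail of the loop above is worth spelling out: advancing by 1 after each hit also reports overlapping matches, while advancing by the needle's length skips them. Here is a minimal sketch of the same index loop wrapped in a sub. The name `find_all` and its `$step` parameter are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return every offset at which $needle occurs in $haystack.
# $step is how far to advance after a hit: 1 (the default) also finds
# overlapping matches, length($needle) finds only non-overlapping ones.
sub find_all {
    my ($haystack, $needle, $step) = @_;
    return if $needle eq '';      # an empty needle would match at every offset
    $step ||= 1;
    my @found;
    my $pos = 0;
    while ((my $idx = index $haystack, $needle, $pos) != -1) {
        push @found, $idx;
        $pos = $idx + $step;
    }
    return @found;
}

print join(',', find_all('abcsdfabcasegabc', 'abc')), "\n";   # 0,6,13
print join(',', find_all('aaaa', 'aa')), "\n";                # 0,1,2
print join(',', find_all('aaaa', 'aa', 2)), "\n";             # 0,2
```

For a marker string that cannot overlap itself, such as "abc", both step sizes give the same result.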
    If you're doing a modification, rindex makes it even easier (though this will be a bit slower for longer strings with many matches):
    $ perl -lE '
        my ($haystack, $needle, $new) = @ARGV;
        while (-1 != (my $curidx = rindex $haystack, $needle)) {
            substr $haystack, $curidx, length($needle), $new;
        }
        say "new string: $haystack";
      ' abcsdfabcasegabc abc foo
    new string: foosdffooasegfoo
    The only challenge with this method is when $new contains $needle: every replacement then creates a fresh match for rindex to find, so the loop never terminates.

    If you are going with the substitution and want to use a regex (probably safer once you escape it), use "\Q" before your string:

    s/\Q$str\E/$new/g; # since \E is at the end, it's not really required.
    Hope that helps.
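To tie the \Q advice back to the original question (iterating over every match of a literal string), here is a minimal sketch that collects match offsets with a global regex. The sub name `regex_positions` and the sample text are made up for illustration. Note that a `/g` loop reports only non-overlapping matches.

```perl
use strict;
use warnings;

# Hypothetical helper: offsets of every literal occurrence of $str in $text,
# found with a regex. \Q...\E makes metacharacters in $str match literally.
sub regex_positions {
    my ($text, $str) = @_;
    return if $str eq '';
    my @found;
    while ($text =~ /\Q$str\E/g) {
        push @found, $-[0];       # $-[0] holds the start offset of the last match
    }
    return @found;
}

my $text = 'a.c abc a.c';
print join(',', regex_positions($text, 'a.c')), "\n";   # 0,8

# Without \Q the dot acts as a wildcard, so "abc" is counted as well.
my $count = () = $text =~ /a.c/g;
print "$count\n";                                       # 3
```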

      I would like to find the positions of all the strings in the source text. The strings mark special positions in the source that need to be processed.

      Perhaps that is the first case in your example, where finding the strings is enough.

      I hadn't realized index could be used like that to find multiple matches, or at least I imagined it would need more code than that. I'm going to use that solution now, instead of a regex.

      Thank you for all the help!
      On a related note to my Edgar database search, is there a way to load a list of regex search patterns from a database?

      I want to replace this long, hard-coded list of regex alternatives with a list sourced from a SELECT statement on a database.

      Is that possible?

      select CONCAT(DATA,'...') FROM TABLE

      which comes out like this:

      06054E...

      063679...

      06369N...

      06374V...

      06417Y...

      06418E...

      but I want to substitute this list into this regular expression.

      print "$1\n" if $lines =~ /(06054E...|063679...|06369N...|06374V...|06417Y...|06418E...)/;
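One way this could be approached is to join the fetched rows into a single alternation and compile it once with qr//. The sketch below assumes the rows are already in a Perl array; the database fetch itself (for example via DBI's selectcol_arrayref) is not shown and is an assumption about the setup. The sub name `build_matcher` and the sample input lines are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for rows fetched from the database. With DBI this list might come
# from something like $dbh->selectcol_arrayref($sql); that part is assumed
# and not shown here.
my @patterns = ('06054E...', '063679...', '06369N...');

# Hypothetical helper: join the patterns into one compiled alternation.
# Empty entries are dropped, because an empty alternative matches anywhere.
sub build_matcher {
    my @pats = grep { defined($_) && length($_) } @_;
    return unless @pats;
    my $alt = join '|', @pats;
    return qr/($alt)/;
}

my $re = build_matcher(@patterns);
for my $line ('CUSIP 06054EAB1 bond', 'nothing here', 'id 06369NXY9') {
    print "$1\n" if $line =~ $re;   # prints 06054EAB1, then 06369NXY9
}
```

The entries are interpolated as regex fragments so that the trailing dots keep working as wildcards. If the column held plain literal strings instead, each one could be passed through quotemeta before joining.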
      Another Way:
      ($str, $substr, $newsubstr) = @ARGV;

      #while ( -1 != ($current = rindex($str, $substr))) {
      #    substr $str, $current, length($substr), $newsubstr;
      #}

      $i = 0;
      while ( -1 != ($current = index($str, $substr, $i)) ) {
          substr $str, $current, length($substr), $newsubstr;
          $i = $current + length($newsubstr);
      }

      $\ = $/;
      print $str;