
Re^2: Regex: finding all possible substrings

by sauoq (Abbot)
on Jun 01, 2012 at 00:01 UTC (#973643)


in reply to Re: Regex: finding all possible substrings
in thread Regex: finding all possible substrings

and see if it matches some permutation

Of course, we're all kind of guessing here because that isn't exactly a well-written spec... but I think assuming he wants all permutations is a bit of a leap. I think it may also be a leap to assume he only wants to search for strings of a certain length. He just says that they are "user defined."

I still think this is the right approach, though. He just has to track the lengths of his needles and be sure to check a substring of each necessary length at every position (so long as it doesn't run off the end of his haystack).

Something like this:

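#!/usr/bin/perl
use warnings;
use strict;

# First argument is the haystack; the remaining arguments are the needles.
my $haystack = shift;
my %needles;
undef @needles{@ARGV};

# Distinct needle lengths, shortest first.
my @len = sort {$a<=>$b} keys %{{ map { (length,0) } keys %needles }};

my $pos = 0;
my $hlen = length $haystack;
# Stop once even the shortest needle would run past the end.
while ($pos + $len[0] <= $hlen) {
    for my $L (@len) {
        # Lengths are sorted, so any longer one would overrun too.
        last if $pos + $L > $hlen;
        my $substr = substr($haystack, $pos, $L);
        $needles{$substr}++ if exists $needles{$substr};
    }
    $pos++;
}

use Data::Dumper;
print Dumper \%needles;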

Update:

$ ./973643.pl 'AAAAA AAACACA CAACAAA' AAA AAC ACA CAA
$VAR1 = {
          'AAC' => 2,
          'ACA' => 3,
          'CAA' => 2,
          'AAA' => 5
        };
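
Since the thread is nominally about regexes, the counts from that run can probably be cross-checked with a zero-width lookahead, which lets overlapping occurrences be counted. This is only a sketch of a different technique from the substr scan, not a replacement for it, and count_overlapping is a made-up helper name:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count overlapping occurrences of each needle in the haystack.
# count_overlapping is a hypothetical helper name, not from the thread.
sub count_overlapping {
    my ($haystack, @needles) = @_;
    my %count;
    for my $needle (@needles) {
        # The lookahead consumes no characters, so the engine moves on
        # one position after each hit and overlapping hits are counted.
        # \Q...\E quotes any regex metacharacters in the needle.
        # "() =" forces list context; assigning that to a scalar
        # yields the number of matches.
        $count{$needle} = () = $haystack =~ /(?=\Q$needle\E)/g;
    }
    return \%count;
}

my $counts = count_overlapping('AAAAA AAACACA CAACAAA', qw(AAA AAC ACA CAA));
print "$_ => $counts->{$_}\n" for sort keys %$counts;
```

It rescans the haystack once per needle, so the single-pass substr version is likely the better choice when there are many needles.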

-sauoq
"My two cents aren't worth a dime.";


Re^3: Regex: finding all possible substrings
by davido (Archbishop) on Jun 01, 2012 at 00:17 UTC

    Of course, we're all kind of guessing here because that isn't exactly a well-written spec...

    That's a pretty good assessment, and I appreciate being granted a little latitude in my interpretation thereof. About the best I could do was guess, and then do a better job than the OP of documenting the criteria I came up with. :) When posts like this come up I have to make a decision whether to take a stab at trying to guess at a more refined specification, or to post a node seeking clarification (which often never comes), or to just let the question go and continue about the work that the question was distracting me from in the first place.

    Sometimes what tips the scales for me is if I find a little amusement in the diversion of coming up with a solution to the specification that I invented by venturing a guess. My hope is that it's also helpful.


    Dave

      or to just let the question go and continue about the work that the question was distracting me from in the first place.

      Ha! . . . ++

      -sauoq
      "My two cents aren't worth a dime.";
