Re: search of a string in another string with 1 wildcard

by Anonymous Monk
on Jul 09, 2014 at 14:22 UTC


in reply to search of a string in another string with 1 wildcard

Brute force approach:
use feature 'say';

my $pattern = 'abcdef';
my @regexes;
for (my ($i, $len) = (0, length $pattern); $i < $len; ++$i) {
    my $regex = $pattern;
    substr($regex, $i, 1) = '.';   # replace one character with a wildcard
    push @regexes, $regex;
}
say join "\n", @regexes;
Output:
.bcdef
a.cdef
ab.def
abc.ef
abcd.f
abcde.
Something like that?

Replies are listed 'Best First'.
Re^2: search of a string in another string with 1 wildcard
by Anonymous Monk on Jul 09, 2014 at 14:27 UTC
    (should be "push @regexes, qr/$regex/", of course)
Re^2: search of a string in another string with 1 wildcard
by carolw (Sexton) on Jul 09, 2014 at 14:54 UTC

    Yes, and then the string should be searched with

    index(myString,@regexes);

    ?

    That doesn't seem to work.

      You need to loop through the regexes. Or, maybe something like:

          ...
          push @regexes, $regex;
      }
      my $r = join '|', @regexes;
      $r = qr/($r)/;          # compile the regex
      say "Regex is: $r";     # debug
      my $string_to_search = "djflsbcdefgkgjdslkgjabfoéabcdefg";
      if ($string_to_search =~ $r) {
          say "Found it ($1) at position ", $-[0];
      }
      # there is a useful magic variable @- (LAST_MATCH_START)
      # check perldoc for it
      Output:
      Regex is: (?^u:(.bcdef|a.cdef|ab.def|abc.ef|abcd.f|abcde.))
      Found it (sbcdef) at position 4

        Works like a charm. Many thx
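For reference, `index` takes one literal substring, not a list of patterns, which is probably why the `index(myString,@regexes)` attempt above fails. Below is a minimal sketch of the "loop through the regexes" alternative. The helper names `one_wildcard_regexes` and `find_with_any` are hypothetical and not from the thread; the use of `quotemeta` is an added assumption so that a pattern containing characters such as `+` or `.` is still treated literally.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: build one compiled regex per wildcard position.
sub one_wildcard_regexes {
    my ($pattern) = @_;
    my @chars = split //, $pattern;
    my @regexes;
    for my $i (0 .. $#chars) {
        # Escape every literal character so regex metacharacters in the
        # pattern are matched literally.
        my @parts = map { quotemeta } @chars;
        $parts[$i] = '.';              # the single wildcard
        my $re = join '', @parts;
        push @regexes, qr/$re/;
    }
    return @regexes;
}

# Hypothetical helper: return [regex, offset] for each regex that matches,
# using the offset of its first match.
sub find_with_any {
    my ($string, @regexes) = @_;
    my @hits;
    for my $re (@regexes) {
        push @hits, [ $re, $-[0] ] if $string =~ $re;
    }
    return @hits;
}

my @hits = find_with_any('djflsbcdefgkgjdslkgj', one_wildcard_regexes('abcdef'));
say "matched at offset $_->[1]" for @hits;
```

Looping keeps each regex separate, so you can tell which wildcard position produced the hit; the joined alternation shown earlier is more compact when only the first match matters.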

Re^2: search of a string in another string with 1 wildcard
by carolw (Sexton) on Jul 19, 2014 at 13:00 UTC

    To extend this problem to any number of wildcards (and not necessarily 1), would it be elegant and efficient to use the same code and change just

    substr($regex, $i, 1) = '.';

    to

    substr($regex, $i, m) = '.';

    where m will be the user's free parameter?

      carolw:

      Not quite. You're changing an $m character substring to a single char, so you could wind up with something like: .cdef, a.def, ab.ef, abc.f, abcd. where you're really wanting ..cdef, a..def, ab..ef, abc..f, abcd..; so you really want something a bit more like:

      substr($regex, $i, $m) = '.' x $m;

      But that's assuming you want your wildcards to be adjacent. If you want the wildcards to be anywhere, you've got a bit more work to do.

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.
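As a small illustration of the adjacent-wildcard case described above, here is a sketch that wraps the `substr` line in a loop. The function name `adjacent_wildcard_patterns` is hypothetical, and the sketch assumes the pattern contains no regex metacharacters. The `x` in `'.' x $m` is Perl's repetition operator, so `'.' x 2` yields `'..'`.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: replace each run of $m adjacent characters
# with $m wildcards.
sub adjacent_wildcard_patterns {
    my ($pattern, $m) = @_;
    my @out;
    for my $i (0 .. length($pattern) - $m) {
        my $regex = $pattern;
        # 'x' repeats the string on its left: '.' x 2 is '..'
        substr($regex, $i, $m) = '.' x $m;
        push @out, $regex;
    }
    return @out;
}

say for adjacent_wildcard_patterns('abcdef', 2);
```

For `'abcdef'` and `$m = 2` this prints ..cdef, a..def, ab..ef, abc..f and abcd.., one per line.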

        In effect, the wildcards could be anywhere.

        Why is there an x before $m in your previous code?

        So, suppose each string to be searched is about 100 chars or more, which is not a lot, but I have a very large number of such strings (millions, maybe more) in which I want to search for the same substring, and more than 1 non-adjacent wildcard is permitted. Would it be efficient in terms of time and memory to use your code (adapted, of course, for more than 1 non-adjacent wildcard), or String::Approx?

        To extend to any number of non-adjacent wildcards, is it a good idea to put the 2 lines in a loop, or would it be better to do it another way?

        for my $j (1 .. $wildcards_nb) {   # where $wildcards_nb is the user's free parameter
            substr($regex, $i, 1) = '.';
            push @regexes, $regex;
        }
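One way to read the loop idea above: the loop as written keeps replacing the same position `$i`, so for non-adjacent wildcards each pattern needs a different combination of positions. The following is only a sketch of that enumeration; `wildcard_patterns` is a hypothetical name, and the pattern is assumed to contain no regex metacharacters. The number of patterns grows as C(n, m) for a pattern of length n, which may become impractical for long patterns or large m.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: all patterns with exactly $m wildcard positions,
# chosen anywhere in $pattern (not necessarily adjacent).
sub wildcard_patterns {
    my ($pattern, $m) = @_;
    my $len = length $pattern;
    my @out;
    my $choose;
    $choose = sub {
        my ($start, @picked) = @_;
        if (@picked == $m) {
            # Enough positions chosen: turn each into a wildcard.
            my $regex = $pattern;
            substr($regex, $_, 1) = '.' for @picked;
            push @out, $regex;
            return;
        }
        # Pick the next position to the right of the last one picked.
        $choose->($_ + 1, @picked, $_) for $start .. $len - 1;
    };
    $choose->(0);
    undef $choose;   # break the closure's reference to itself
    return @out;
}

my @patterns = wildcard_patterns('abcd', 2);
say for @patterns;   # 6 patterns, i.e. C(4, 2)
```

The resulting strings can be joined with `'|'` and compiled with `qr//`, as in the earlier single-wildcard reply.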

        Well, I have a large number of strings of 100-char length or more and would like to search for a substring with m wildcards of mismatch (m >= 1, user's free parameter) in all of the strings.

        So I started the thread with 1 wildcard, but then realized that the number of wildcards should be any number >= 1.
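For the "m mismatches anywhere" requirement, a different technique from the regex lists above may be worth considering: slide a window over the string and count differing characters with string XOR. This is a sketch under assumptions, not a benchmarked answer to the efficiency question. The name `approx_positions` is hypothetical, the strings are assumed to hold single-byte characters, and the code relies on `^` acting as string XOR, which holds as long as the `bitwise` feature (enabled by `use v5.28` and later) is not in effect.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: offsets in $text where $pattern matches with at
# most $max_mismatch differing characters (single-byte characters assumed).
sub approx_positions {
    my ($text, $pattern, $max_mismatch) = @_;
    my $plen = length $pattern;
    my @offsets;
    for my $i (0 .. length($text) - $plen) {
        # XOR of two equal-length strings gives "\0" wherever they agree;
        # tr/\0//c counts the characters that are not "\0", i.e. mismatches.
        my $diff = substr($text, $i, $plen) ^ $pattern;
        my $mismatches = ($diff =~ tr/\0//c);
        push @offsets, $i if $mismatches <= $max_mismatch;
    }
    return @offsets;
}

say join ',', approx_positions('xxabXdYfxxabcdef', 'abcdef', 2);   # prints 2,10
```

Unlike the combination approach, the work per string here does not depend on how many wildcard placements exist, only on the string and pattern lengths.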

Re^2: search of a string in another string with 1 wildcard
by carolw (Sexton) on Oct 12, 2014 at 14:43 UTC

    I would like to slightly change the question:

    How should the code be modified (specifically the '.') if the pattern to be matched is a fixed string and, at one position, the character is not any character but one from a given set of characters?

    $pattern = 'abcdef';

    For example, at the 3rd position, c could only be replaced by a character from the set {r,d,n,f,q,m}.

      One way:

      c:\@Work\Perl>perl -wMstrict -le
      "my $string = 'abcdef';
       ;;
       my $pattern = qr{ [rdnfqm] }xms;
       ;;
       print qq{matched '$1' at offset $-[1]} if $string =~ m{ ($pattern) }xms;
      "
      matched 'd' at offset 3
      The construct  [rdnfqm] defines a "character class". Please see perlre, perlrequick, and perlretut.
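To tie the character class back to the fixed-string question, here is a sketch that builds a regex from a pattern, a 0-based position, and a set of allowed characters. The helper name `pattern_with_set` is hypothetical, and including the original `c` in the set is an assumption about what the question intends.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: build a regex from a fixed pattern in which the
# character at $pos (0-based) may be any character listed in $set.
sub pattern_with_set {
    my ($pattern, $pos, $set) = @_;
    my @parts = map { quotemeta } split //, $pattern;
    # quotemeta also escapes characters such as ']' or '^' inside the class.
    $parts[$pos] = '[' . quotemeta($set) . ']';
    my $re = join '', @parts;
    return qr/$re/;
}

# 3rd position (index 2): 'c' or one of r, d, n, f, q, m.
my $re = pattern_with_set('abcdef', 2, 'crdnfqm');
for my $s (qw(abcdef abfdef abxdef)) {
    say "$s: ", ($s =~ $re ? "match at offset $-[0]" : 'no match');
}
```

Here `abcdef` and `abfdef` match at offset 0, while `abxdef` does not match because `x` is not in the set.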

        What do you think of this?

        my @matches = (
            qr/abcdef/,
            qr/abfdef/,   # for simplicity, limited to 2 choices
        );
        if ($line ~~ @matches){ ... }

        In fact, shouldn't your code be something like

        my $pattern = qr{ab[crdnfqm]def};
        if $string =~ m{ ($pattern) }; ....

        That doesn't find all strings with the pattern.

        I don't understand the trailing xms at the end of the pattern.

        I think you should forget my previous solution, as the user would have to type out every qr//, which is not convenient.
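On the trailing xms: these are regex modifiers documented in perlre. `/x` lets you put whitespace and comments in the pattern, `/m` makes `^` and `$` match at line boundaries, and `/s` lets `.` match a newline. The sketch below, with a made-up test string, shows why dropping `/x` from `m{ ($pattern) }` can make matches disappear: without it, the spaces around `($pattern)` are literal and must occur in the string.

```perl
use strict;
use warnings;
use feature 'say';

my $pattern = qr{ab[crdnfqm]def};
my $string  = 'xxabfdefxx';   # made-up test string

# With /x, the spaces around ($pattern) are ignored.
say 'with /x:    ', ($string =~ m{ ($pattern) }x ? "matched '$1'" : 'no match');

# Without /x, those spaces are literal and must appear in the string.
say 'without /x: ', ($string =~ m{ ($pattern) }  ? "matched '$1'" : 'no match');
```

The first line reports a match on `abfdef`; the second reports no match, since the string contains no spaces.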
