Re: Pattern searching allowing for mis-matches...

by bv (Friar)
on Dec 13, 2009 at 17:33 UTC ( #812603=note )

in reply to Pattern searching allowing for mis-matches...

You could probably do better using regular expressions with global matching. I don't completely understand what you mean by "slight mismatches" (please explain!), but I think something like this would handle your exact matches:

$matches = ($value =~ /$search/g);

UPDATE: using scalar where I did was wrong.
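For what it's worth, a global match evaluated in scalar context only reports whether the next match succeeded; it does not return a count. If a count of the exact matches is what you are after, forcing list context is one way to get it. A minimal sketch, where count_matches is just a name made up for illustration and the sequence is an arbitrary example:

```perl
use strict;
use warnings;

# Hypothetical helper: count non-overlapping exact occurrences of $search
# in $value. \Q...\E keeps the search string literal.
sub count_matches {
    my ($value, $search) = @_;
    # The empty list "() =" forces list context, so the assignment
    # yields the number of matches rather than a true/false value.
    my $count = () = $value =~ /\Q$search\E/g;
    return $count;
}

my $value = "TGATTGGAATGTTAGAT";    # illustrative sequence only
print count_matches($value, "GAT"), "\n";
```

Note that /g walks the string left to right without backing up, so overlapping occurrences are counted once.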

@_=qw; Just another Perl hacker,; ;$_=q=print "@_"= and eval;

Replies are listed 'Best First'.
Re^2: Pattern searching allowing for mis-matches...
by MaroonBalloon (Acolyte) on Dec 13, 2009 at 19:29 UTC
    Yes, the direct matches are not a problem. I am glad you could confirm the code. For the mismatch code I mean: I have a hash value which is a DNA sequence, example:

    If my % threshold was .75, for example, and I was searching for TGAT, I would like the program to find the identical match at position 1, AND identify the second match at position 5 as an "acceptable mismatch".

    With respect to your suggestion of regexps: is it possible to type those in on the command line? Currently I use @ARGV[0] as my $search query, typically something like "ATC".
    Thanks! ER

      Your second question is much easier: yes. Regular expressions can be built from any string, including one supplied by the user. Generally, you should use quotemeta or the \Q and \E markers so that regular expression metacharacters such as * and ., along with more evil eval-type expressions, are treated as literal text. In your case, you could also check that the string is a valid nucleotide (nt) sequence:

      my $string = quotemeta shift;
      die "Not a valid nucleotide sequence" if $string =~ /[^AGTC]/;
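Put together, that could look something like the sketch below. build_pattern is a hypothetical helper name, and the query is hard-coded here so the snippet runs on its own; in a real script you would take it from the command line with something like shift @ARGV.

```perl
use strict;
use warnings;

# Hypothetical helper: validate a query string and compile it into a
# pattern that matches it literally.
sub build_pattern {
    my ($query) = @_;
    die "Not a valid nucleotide sequence\n"
        if !defined $query || $query !~ /\A[ACGT]+\z/;
    # \Q...\E is redundant after the check above, but it is a cheap
    # habit for anything that comes from the user.
    return qr/\Q$query\E/;
}

my $query   = "ATC";    # assumption: stands in for the command-line argument
my $pattern = build_pattern($query);
print "found $query\n" if "GGATCC" =~ $pattern;
```

Validating before compiling means a stray metacharacter is rejected outright instead of being silently escaped.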

      As for the first question, one way would be to build a regex for each possibility. An example:

      my $string = "TGAT";
      my @nts = map {
          my $tmp = $string;
          substr $tmp, $_, 1, '.';
          $tmp;
      } (0 .. length ($string) -1);
      my $groupings = join '|', @nts;
      my $sample = "TGATTGGAATGTTAGAT";
      while ( $sample =~ /($groupings)/go ) {
          print "Matched $1 ending at position ", pos $sample, "\n";
      }
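If you want the .75 threshold itself to drive the search, rather than hard-coding "one mismatch allowed", another option is to skip the regex and compare every window of the sample against the query directly. This is only a sketch of that alternative, not the approach above: fuzzy_matches is a made-up name, positions are reported 1-based, and "identity" here simply means the fraction of positions that agree.

```perl
use strict;
use warnings;

# Hypothetical helper: slide a window the length of $query along $sample
# and return [1-based start, window text, fraction identical] for every
# window whose identity is at least $threshold.
sub fuzzy_matches {
    my ($sample, $query, $threshold) = @_;
    my $len = length $query;
    return unless $len;    # nothing sensible to do with an empty query
    my @hits;
    for my $start (0 .. length($sample) - $len) {
        my $window = substr $sample, $start, $len;
        # grep in scalar context counts the positions that agree
        my $same = grep {
            substr($window, $_, 1) eq substr($query, $_, 1)
        } 0 .. $len - 1;
        my $identity = $same / $len;
        push @hits, [ $start + 1, $window, $identity ]
            if $identity >= $threshold;
    }
    return @hits;
}

for my $hit (fuzzy_matches("TGATTGGAATGTTAGAT", "TGAT", 0.75)) {
    my ($pos, $text, $identity) = @$hit;
    my $kind = $identity == 1 ? "exact match" : "acceptable mismatch";
    printf "%s at position %d: %s (%.2f)\n", $kind, $pos, $text, $identity;
}
```

With this sample it reports an exact match at position 1 and acceptable mismatches at positions 10 and 14. Unlike the /g loop, the windows may overlap, and the cost grows with sample length times query length, which is fine for short queries.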

      @_=qw; Just another Perl hacker,; ;$_=q=print "@_"= and eval;
