PerlMonks  

Re: non-exact regexp matches

by Abigail-II (Bishop)
on Jun 23, 2004 at 15:52 UTC


in reply to Re^2: non-exact regexp matches
in thread non-exact regexp matches

I don't see the connection between using (?{ }) and (?!) to report all matches and your original question about finding "partial" matches.

So you want any regexp to match fuzzily. But then your example is unclear: it picks out positions in the regex (not in the string) to indicate where characters may be changed. Do you also want to be able to change special characters? Is it OK to introduce characters into the regex to make it match? (That would be easy: just add a | as the first character of the regex.)
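To illustrate the parenthetical remark about a leading |: a minimal sketch, assuming a short stand-in pattern (ACCAAC) and a hypothetical helper named matched_length. An empty first alternative matches the empty string at position 0, so the "fuzzy" regex succeeds on any input and says nothing about the data.

```perl
use strict;
use warnings;

# Hypothetical helper: length of the matched text, or undef when there is no match.
sub matched_length {
    my ($re, $string) = @_;
    return $string =~ $re ? $+[0] - $-[0] : undef;
}

my $strict  = qr/ACCAAC/;
my $relaxed = qr/|ACCAAC/;   # same pattern with an empty first alternative

for my $input ('GGGG', 'ACCAAC') {
    for my $re ($strict, $relaxed) {
        my $len = matched_length($re, $input);
        printf "%-20s on %-6s: %s\n", $re, $input,
            defined $len ? "matched $len character(s)" : 'no match';
    }
}
```

Because alternatives are tried left to right, the relaxed pattern takes the empty branch even when the real pattern would have matched.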

Abigail

Replies are listed 'Best First'.
Re^2: non-exact regexp matches
by vinforget (Beadle) on Jun 23, 2004 at 17:24 UTC
    I refined my question a little more. I have a string of letters [ACGTacgtNn] from which I want to find a particular instance of a regexp, let's say:
    /ACCAAC[ACGTacgtNn]{6}CTA[ACGTacgtNn]{1}ATG[ACGTacgtNn]{1,2}GATGTT/

    I can do this just fine, but what if I want to match the above regexp with a tolerance of 2 mismatches in single characters? Below I have an example:
    $buf =~ m/(A)(C)(C)(A)(A)(C)([ACGTacgtNn]{6})(CTA[ACGTacgtNn]{1})(A)(T)(G)([ACGTacgtNn]{1,2})(G)(A)(T)(G)(T)(T)(?{ print $-[0]," ",scalar@-,"\n"; })(?!)/;
    This will print the position of the match in $buf, followed by 19 (the number of submatches). I want to be able to return a match with 17-19 submatches, not just all 19. Thanks. Vince
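For readers unfamiliar with the (?{ }) plus (?!) combination used in the post above, here is a minimal sketch of the idiom on its own. The pattern AC, the helper name all_match_starts, and the package variable @starts are all hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

our @starts;   # package variable, so the regex code block can reach it

# Hypothetical helper: start offset of every occurrence of the literal "AC".
sub all_match_starts {
    my ($string) = @_;
    local @starts = ();
    # (?{ ... }) runs each time the engine gets past "AC";
    # (?!) then always fails, forcing the engine to try the next possibility.
    $string =~ /AC(?{ push @starts, $-[0] })(?!)/;
    my @found = @starts;
    return @found;
}

print join(' ', all_match_starts('ACGACAC')), "\n";
```

The overall match always fails, which is the point: the side effect in the code block is what reports each candidate.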
      Will this do?
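      use re 'eval';
      no strict 'refs';
      # Every group is optional; the final conditional succeeds only if
      # at least 17 of $1 .. $19 are defined, otherwise (?!) forces a failure.
      if (/(A)?(C)?(C)?(A)?(A)?(C)?([ACGTacgtNn]{6})?(CTA[ACGTacgtNn]{1})?
           (A)?(T)?(G)?([ACGTacgtNn]{1,2})?(G)?(A)?(T)?(G)?(T)?(T)?
           (?(?{17 <= grep {defined $$_} 1 .. 19})|(?!))/x) {
          ...
      }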

      Abigail

        Hi, I tested your program on the following sequences:
        ACCAACCGGATTCTAGATGCAGATGTTGAAGATT # works OK.
        Change the second character (the first C) to a G:
        AGCAACCGGATTCTAGATGCAGATGTTGAAGATT # Doesn't work.
        I'm looking over your regexp... but it'll take me a while to figure it out. Thanks for the help. Vince
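One likely reason the mutated sequence fails: an optional group such as (C)? can match the empty string, but it cannot consume the G that replaced the C, so every later group is tried at the wrong position. A different technique, not the regex approach from this thread, is to slide a fixed-length template over the sequence and count mismatches. The sketch below assumes only substitutions need to be tolerated (no insertions or deletions) and that the input contains only letters from [ACGTacgtNn], so a wildcard position accepts anything. The names @TEMPLATES and fuzzy_hits are hypothetical.

```perl
use strict;
use warnings;

# One template per allowed length of the {1,2} gap; '.' means "any base".
my @TEMPLATES = (
    'ACCAAC' . ('.' x 6) . 'CTA' . '.' . 'ATG' . '.'  . 'GATGTT',
    'ACCAAC' . ('.' x 6) . 'CTA' . '.' . 'ATG' . '..' . 'GATGTT',
);

# Return [offset, mismatches, template length] for every window
# whose fixed positions differ from the template in at most $max places.
sub fuzzy_hits {
    my ($seq, $max) = @_;
    my @hits;
    for my $tpl (@TEMPLATES) {
        my $len = length $tpl;
        for my $off (0 .. length($seq) - $len) {
            my $miss = 0;
            for my $i (0 .. $len - 1) {
                my $want = substr($tpl, $i, 1);
                next if $want eq '.';                       # wildcard position
                $miss++ if $want ne substr($seq, $off + $i, 1);
            }
            push @hits, [$off, $miss, $len] if $miss <= $max;
        }
    }
    return @hits;
}

for my $hit (fuzzy_hits('AGCAACCGGATTCTAGATGCAGATGTTGAAGATT', 2)) {
    printf "offset %d, %d mismatch(es), window length %d\n", @$hit;
}
```

This trades the regex engine for a plain loop, which makes the tolerance an ordinary parameter at the cost of listing each gap length as its own template.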
