matchin a pattern 250 characters up and down stream

by Anonymous Monk
on Jun 27, 2012 at 20:52 UTC

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,
I have a file (file1) which contains something like this:
0 yyx 3020
1 yyw 10,000
I also have another file (file2) which contains a sequence of characters, say abceffredd...., at positions 1 to n. For each line of the first file I want to look into file2. If the first column is 0, it should go to the given position in file2 (e.g. 3020) and look for a pattern match, say $abc, 250 characters after 3020 (i.e., 3270). If the first column is 1, it should go 250 characters before 3020 and search for the pattern $abc. What should be done for this? I think using substr, as in
$pos1 = $pos - 250; $seq250 = substr($sequence, $pos1, 250);
would help, but I am not really able to work it out.
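
The substr idea from the question can be sketched roughly as follows. This is only a minimal sketch under several assumptions that the question does not settle: the whole of file2 has already been read into a string, positions are 1-based, "downstream" means the characters right after the position, and "upstream" means the characters right before it. The helper name window_for and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the $len characters downstream of $pos
# (flag 0) or upstream of $pos (flag 1). $pos is assumed to be 1-based.
sub window_for {
    my ($sequence, $flag, $pos, $len) = @_;

    # substr uses 0-based offsets, so 1-based position $pos is offset $pos - 1.
    my $start = $flag == 0 ? $pos : $pos - 1 - $len;

    if ($start < 0) {    # window runs off the front: shorten it
        $len += $start;
        $start = 0;
    }
    return '' if $len <= 0 || $start >= length $sequence;
    return substr($sequence, $start, $len);
}

my $sequence = 'ABCDEFGHIJKLMNABCRSTUVWXYZ';    # made-up stand-in for file2
my $abc      = 'ABC';                           # pattern held in a variable

for my $row ([0, 10, 10], [1, 10, 10]) {
    my ($flag, $pos, $len) = @$row;
    my $window = window_for($sequence, $flag, $pos, $len);

    # \Q...\E treats the contents of $abc as literal text (an assumption).
    my $answer = $window =~ /\Q$abc\E/ ? 'Y' : 'N';
    print "$flag $pos: $answer\n";
}
```

Clamping the start offset matters because a negative offset makes substr count from the end of the string, which would silently search the wrong region.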

Replies are listed 'Best First'.
Re: matchin a pattern 250 characters up and down stream
by muba (Priest) on Jun 27, 2012 at 23:39 UTC
    use strict;
    use warnings;

    # Warning!
    # This code is untested, as I don't have any representative test data
    # to work with. As such, it may contain bugs, logical faults, and other
    # subtleties. However, it should get the idea along.

    # Warning!
    # Because your specifications were a little vague at best, I had to do
    # some guesswork. One of the things I guessed, was that by
    # > search for the pattern $abc
    # you meant you want to match against the pattern as stored in the
    # variable $abc, as opposed to matching against the literal string '$abc',
    # that is, dollar, small letter a, small letter b, small letter c.
    # Another thing I had to guess was that those strings "yyx" and "yyw",
    # as from your sample input, are likely candidates to be the value of
    # this variable $abc.
    # This code is written with the above assumptions in mind.

    # Warning!
    # In your original post, you defined, more or less, what the needle
    # is that we're looking for ($abc), and what the haystack is we're
    # looking in (file2), but not what you want to happen when something
    # is actually found.

    my $length = 250;
    my @w00t = ("Roses are red,", " Violets are blue,", "This lame-ass poem,", " doesn't rhyme.");

    # Let's open those files.
    open my $fhConditions, "<", "somethinglikethis.txt" or die "Epic Fail: $!";
    open my $fhCharacters, "<", "file2" or die "OMG Fail: $!";

    while (my $line = <$fhConditions>) {
        chomp $line;
        my ($where, $abc, $position) = split(' ', $line); # Uncomment for Case 2

        $position -= $length if $where == 0;
        $position = 0 if $position < 0;

        seek $fhCharacters, $position, 0;
        my $data;
        read $fhCharacters, $data, $length;

        if ($data =~ m/$abc/) {
            print $w00t[rand @w00t], "\n";
        }
    }
      Note to self: when muba uses warnings, muba really uses warnings! :)
      FILE1
      0 XYZ 5
      1 TTY 15
      0 MNU 10
      FILE2
      ABCDEFGHIJKLMNABCRSTUVWXYZ.....
      It should read file 1, get the position (3rd column), and open another file (file2). If 0 appears in column 1 of file 1, it should go, say, 10 positions ahead of position 5 and check whether $ABC is found. The same applies if 1 appears, except that it has to go 10 characters before position 5 in this example.
      If a match is found like this it should print Y, otherwise N.
      open my $fhConditions, "<", "1.txt" or die "Epic Fail: $!";
      use strict;
      use warnings;
      open my $fhCharacters, "<", "2.txt" or die "OMG Fail: $!";
      my $length = 250;
      while (my $line = <$fhConditions>) {
          chomp $line;
      my ($strand, $chr, $position) = split(' ', $line);
          $position += $length if $strand == 0;
          $position -= $length if $strand == 1;
          seek $fhCharacters, $position, 0;
          my $data;
          read $fhCharacters, $data, $length;
          if ($data =~ m/$AAGCTT/) {
              print "Y" "\n";
                  else
              print "N" "\n";
          }
      }
      This is the code I derived from yours.

      When I run it, it gives an error like: Missing $ on loop variable at restriction.pl line 8.

        That's not what it says for me. For me it says,

        String found where operator expected at G:\x.pl line 15, near ""Y" "\n""
                (Missing operator before "\n"?)
        String found where operator expected at G:\x.pl line 17, near ""N" "\n""
                (Missing operator before "\n"?)
        Global symbol "$AAGCTT" requires explicit package name at G:\x.pl line 14.
        syntax error at G:\x.pl line 15, near ""Y" "\n""
        Execution of G:\x.pl aborted due to compilation errors.
        Which sounds about right. Let's go through those errors one by one, shall we?

        String found where operator expected at G:\x.pl line 15, near ""Y" "\n"" (Missing operator before "\n"?)

        Look at that. Perl is even being so helpful as to point out what's wrong with that line. The missing operator it's talking about would be the comma operator. But why wouldn't you just write that as print "Y\n";?

        The next error message? I think you can figure that one out.

        Global symbol "$AAGCTT" requires explicit package name at G:\x.pl line 14.

        Yep. That regex is never going to work. In regular expressions, $ means "end of string". Surely nothing can appear after the end of a string, right? So Perl's being so liberal as to guess what you meant there, and it thinks you want to match against some variable $AAGCTT. This is why I have asked you whether you want to match against a variable or against a literal string. Turns out neither was the case and you meant ^ there all along, which means "beginning of string". Which would work.
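
The three readings discussed here (a variable, a literal string, and an anchored literal) can be compared side by side. This is a minimal sketch with made-up sample data; the variable names are hypothetical. The one fact it relies on is that Perl interpolates a $name inside a regex as a variable, which is why strict complains when no such variable has been declared.

```perl
use strict;
use warnings;

my $data = 'AAGCTTGGATCC';    # made-up sample data

# Variable interpolation: the regex is built from the contents of $site.
my $site        = 'AAGCTT';
my $by_variable = $data =~ /$site/ ? 1 : 0;

# Literal match: no sigil, so nothing is interpolated.
my $by_literal = $data =~ /AAGCTT/ ? 1 : 0;

# Anchored match: ^ requires the sequence at the very start of the string.
my $at_start = $data =~ /^AAGCTT/ ? 1 : 0;

# The same anchored match fails once the sequence is no longer at the start.
my $shifted          = "GG$data";
my $shifted_at_start = $shifted =~ /^AAGCTT/ ? 1 : 0;

print "$by_variable $by_literal $at_start $shifted_at_start\n";
```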

        We asked for examples, so that we could understand your problem better. In an attempt to simplify your problem, you kept coming up with contrived sample data like abcdefgabcdefgabcdefg, but the monks here aren't afraid of a little DNA. You could've told us -or shown us, preferably- that you were dealing with data in which you needed to find a specific DNA sequence at a specific location.

        Anyway, I digress. The question we should be asking ourselves here is, "do you really need regular expressions?" Well, no. Because it's not really like you're pattern matching. Again, we could've caught this much sooner if you had just answered my question: "yes, muba, I do want to check if a literal string, "AAGCTT" to be precise, appears at the calculated position in file 2." And I would've told you,

        "Well, AnonMonk. That's fairly easy. How about this?

        my $substring = "AAGCTT";
        read $fhCharacters, $data, length($substring);
        if ($data eq $substring) {
        If you run into any problems, let us know! Good luck."
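
The read-and-compare suggestion can be tried without any files on disk. The following is a minimal sketch, assuming 0-based byte offsets (which is what seek expects, so a 1-based position would need 1 subtracted first); the helper name found_at and the sample contents are hypothetical, and an in-memory filehandle stands in for file 2.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $substring sits exactly at byte offset $position.
sub found_at {
    my ($fh, $position, $substring) = @_;
    seek $fh, $position, 0 or return 0;

    # Read only as many bytes as the substring is long, then compare with eq.
    my $got = read $fh, my $data, length $substring;
    return 0 unless defined $got && $got == length $substring;
    return $data eq $substring;
}

# An in-memory filehandle stands in for file 2 so the sketch runs as-is.
my $contents = 'GGGGGAAGCTTCCCCC';
open my $fhCharacters, '<', \$contents
    or die "Cannot open in-memory file: $!";

for my $pos (5, 6) {
    my $answer = found_at($fhCharacters, $pos, 'AAGCTT') ? 'Y' : 'N';
    print "$pos: $answer\n";
}
```

Comparing with eq avoids the regex engine entirely, so no character in the substring can be misread as a metacharacter.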

        syntax error at G:\x.pl line 15, near ""Y" "\n""

        Perl's being so kind as to point to that line again, in case we missed it the first time around or anything. At the same time, it's not pointing out several other things wrong with your code, so let me do that for you.

        You should close your if block with a } before the else keyword, and similarly you should open your else block with a { after that keyword.

        From a readability point of view, the indentation of the else keyword is one level too deep. And while we're discussing indentation levels anyway, the my ($strand, $chr, $position)... line should go one level deeper. These are just cosmetic issues, of course, but in the long run you really profit from picking up the good habits right from the start.

        Speaking of good habits, your script should begin with the lines use strict; and use warnings;. You might want to move your top line down two lines.

        Apply these fixes, and let us know how that works out for you.
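
With those fixes applied, the script might look roughly like this. It is a sketch, not a tested solution for the real data: it assumes 0-based byte offsets, assumes the literal sequence AAGCTT is what should be found at the shifted position, and uses made-up in-memory stand-ins for 1.txt and 2.txt so that it runs as-is. The helper name check_line is hypothetical.

```perl
use strict;
use warnings;

my $length = 250;         # distance to move up- or downstream
my $site   = 'AAGCTT';    # literal sequence to look for (an assumption)

# Made-up stand-ins for 1.txt and 2.txt.
my $conditions = "0 chr1 10\n1 chr1 510\n0 chr1 20\n";
my $characters = ('G' x 260) . $site . ('C' x 100);

open my $fhConditions, '<', \$conditions or die "Epic Fail: $!";
open my $fhCharacters, '<', \$characters or die "OMG Fail: $!";

# Hypothetical helper: returns 'Y' or 'N' for one line of the first file.
sub check_line {
    my ($fh, $line) = @_;
    my ($strand, $chr, $position) = split ' ', $line;

    $position += $length if $strand == 0;    # downstream
    $position -= $length if $strand == 1;    # upstream
    return 'N' if $position < 0;

    seek $fh, $position, 0 or return 'N';
    my $got = read $fh, my $data, length $site;

    if (defined $got && $data eq $site) {
        return 'Y';
    }
    else {
        return 'N';
    }
}

while (my $line = <$fhConditions>) {
    chomp $line;
    my $answer = check_line($fhCharacters, $line);
    print "$answer\n";
}
```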

Re: matchin a pattern 250 characters up and down stream
by BrowserUk (Patriarch) on Jun 27, 2012 at 21:58 UTC

    You might consider trying to clarify your question. I've read it twice and I cannot work out quite what you are trying to do.

    A worked example -- using, say, 30 instead of 3000; 25 instead of 250 etc. -- might be the simplest clarification.


