
Re: matchin a pattern 250 characters up and down stream

by muba (Priest)
on Jun 27, 2012 at 23:39 UTC

in reply to matchin a pattern 250 characters up and down stream

use strict;
use warnings;

# Warning!
# This code is untested, as I don't have any representative test data
# to work with. As such, it may contain bugs, logical faults, and other
# subtleties. However, it should get the idea along.

# Warning!
# Because your specifications were a little vague at best, I had to do
# some guesswork. One of the things I guessed was that by
# > search for the pattern $abc
# you meant you want to match against the pattern as stored in the
# variable $abc, as opposed to matching against the literal string '$abc',
# that is, dollar, small letter a, small letter b, small letter c.
# Another thing I had to guess was that those strings "yyx" and "yyw",
# as from your sample input, are likely candidates to be the value of
# this variable $abc.
# This code is written with the above assumptions in mind.

# Warning!
# In your original post, you defined, more or less, what the needle
# is that we're looking for ($abc), and what the haystack is we're
# looking in (file2), but not what you want to happen when something
# is actually found.

my $length = 250;
my @w00t = ("Roses are red,", " Violets are blue,",
            "This lame-ass poem,", " doesn't rhyme.");

# Let's open those files.
open my $fhConditions, "<", "somethinglikethis.txt" or die "Epic Fail: $!";
open my $fhCharacters, "<", "file2" or die "OMG Fail: $!";

while (my $line = <$fhConditions>) {
    chomp $line;
    my ($where, $abc, $position) = split(' ', $line);   # Uncomment for Case 2

    $position -= $length if $where == 0;
    $position = 0 if $position < 0;

    seek $fhCharacters, $position, 0;
    my $data;
    read $fhCharacters, $data, $length;

    if ($data =~ m/$abc/) {
        print $w00t[rand @w00t], "\n";
    }
}

Replies are listed 'Best First'.
Re^2: matchin a pattern 250 characters up and down stream
by frozenwithjoy (Priest) on Jun 28, 2012 at 05:12 UTC
    Note to self: when muba uses warnings, muba really uses warnings! :)
Re^2: matchin a pattern 250 characters up and down stream
by Anonymous Monk on Jun 28, 2012 at 05:15 UTC
    0 XYZ 5
    1 TTY 15
    0 MNU 10
    It should read file 1, get the position (3rd column), and open another file (file2). From position 5 it should go, say, 10 positions ahead if 0 appears in column 1 of file 1, and check whether $ABC is found. The same applies if 1 appears, but then it has to go 10 characters before position 5 in this example.
    If a match is found like this it should print Y, otherwise N.
Re^2: matchin a pattern 250 characters up and down stream
by Anonymous Monk on Jun 28, 2012 at 15:52 UTC
    open my $fhConditions, "<", "1.txt" or die "Epic Fail: $!";
    use strict;
    use warnings;
    open my $fhCharacters, "<", "2.txt" or die "OMG Fail: $!";
    my $length = 250;
    while (my $line = <$fhConditions>) {
        chomp $line;
    my ($strand, $chr, $position) = split(' ', $line);
        $position += $length if $strand == 0;
        $position -= $length if $strand == 1;
        seek $fhCharacters, $position, 0;
        my $data;
        read $fhCharacters, $data, $length;
        if ($data =~ m/$AAGCTT/) {
            print "Y" "\n";
            else
            print "N" "\n";
        }
    }
    This is the code I derived from yours.

    When I run it, it gives an error like: Missing $ on loop variable at line 8.

      That's not what it says for me. For me it says,

      String found where operator expected at G:\ line 15, near ""Y" "\n""
              (Missing operator before "\n"?)
      String found where operator expected at G:\ line 17, near ""N" "\n""
              (Missing operator before "\n"?)
      Global symbol "$AAGCTT" requires explicit package name at G:\ line 14.
      syntax error at G:\ line 15, near ""Y" "\n""
      Execution of G:\ aborted due to compilation errors.
      Which sounds about right. Let's go through those errors one by one, shall we?

      String found where operator expected at G:\ line 15, near ""Y" "\n"" (Missing operator before "\n"?)

      Look at that. Perl is even being so helpful as to point out what's wrong with that line. The missing operator it's talking about would be the comma operator, as in print "Y", "\n";. But why wouldn't you just write that as print "Y\n";?

      The next error message? I think you can figure that one out.

      Global symbol "$AAGCTT" requires explicit package name at G:\ line 14.

      Yep. That regex is never going to work. In regular expressions, $ means "end of string". Surely nothing can appear after the end of a string, right? So Perl's being so liberal as to guess what you meant there, and it thinks you want to match against some variable $AAGCTT. This is why I have asked you whether you want to match against a variable or against a literal string. Turns out neither was the case and you meant ^ there all along, which means "beginning of string". Which would work.
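
      To make the difference concrete, here is a minimal sketch of the three ways that pattern could be written. The match_report helper and the $needle variable are made up for illustration and appear nowhere in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): reports how a DNA string
# matches under three different ways of writing the pattern.
sub match_report {
    my ($data) = @_;
    my $needle = "AAGCTT";    # the sequence we are looking for

    return {
        interpolated => ($data =~ m/$needle/) ? 1 : 0,   # variable, matches anywhere
        literal      => ($data =~ m/AAGCTT/)  ? 1 : 0,   # literal, matches anywhere
        anchored     => ($data =~ m/^AAGCTT/) ? 1 : 0,   # literal, start of string only
    };
}

my $r = match_report("CCAAGCTTGG");
print "interpolated=$r->{interpolated} literal=$r->{literal} anchored=$r->{anchored}\n";
```

      With the sequence two characters into the string, the first two patterns match and the anchored one does not. Writing m/$AAGCTT/ is the first form, only with a variable that was never declared.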

      We asked for examples, so that we could understand your problem better. In an attempt to simplify your problem, you kept coming up with contrived sample data like abcdefgabcdefgabcdefg, but the monks here aren't afraid of a little DNA. You could've told us -or shown us, preferably- that you were dealing with data in which you needed to find a specific DNA sequence at a specific location.

      Anyway, I digress. The question we should be asking ourselves here is, "do you really need regular expressions?" Well, no. Because it's not really like you're pattern matching. Again, we could've caught this much sooner if you had just answered my question: "yes, muba, I do want to check if a literal string, "AAGCTT" to be precise, appears at the calculated position in file 2." And I would've told you,

      "Well, AnonMonk. That's fairly easy. How about this?

      my $substring = "AAGCTT";
      read $fhCharacters, $data, length($substring);
      if ($data eq $substring) {
          # ... handle the match here
      }
      If you run into any problems, let us know! Good luck."
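
      For anyone who wants to try that idea in isolation, here is a minimal sketch, assuming byte offsets counted from the start of the file. The check_at helper and the sample sequence are made up, and an in-memory filehandle stands in for file 2.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "Y" if $substring occurs exactly at byte
# offset $position in the file behind $fh, and "N" otherwise.
sub check_at {
    my ($fh, $position, $substring) = @_;

    seek $fh, $position, 0 or return "N";    # 0 = offset from start of file
    my $data = '';
    my $got = read $fh, $data, length($substring);
    return "N" unless defined $got;          # read error
    return $data eq $substring ? "Y" : "N";
}

# Stand-in for file 2: an in-memory filehandle over a made-up sequence.
my $sequence = "CCGGAAGCTTAACC";
open my $fhCharacters, "<", \$sequence or die "Cannot open in-memory file: $!";

print check_at($fhCharacters, 4, "AAGCTT"), "\n";
print check_at($fhCharacters, 0, "AAGCTT"), "\n";
```

      This prints Y for offset 4 and N for offset 0. Reading only length($substring) characters and comparing with eq avoids the regex engine entirely, which is the point of the suggestion above.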

      syntax error at G:\ line 15, near ""Y" "\n""

      Perl's being so kind as to point to that line again, in case we missed it the first time around or anything. At the same time, it's not pointing out several other things wrong with your code, so let me do that for you.

      You should close your if block with a } before the else keyword, and similarly you should open your else block with a { after that keyword.
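
      As a rough sketch of that shape, here is the Y/N branch with both blocks braced. The verdict helper is hypothetical and only wraps the branch so it can be run on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: the same Y/N decision as in the script, with the
# if block closed before else and the else block opened after it.
sub verdict {
    my ($data, $wanted) = @_;
    if ($data eq $wanted) {
        return "Y\n";
    }
    else {
        return "N\n";
    }
}

print verdict("AAGCTT", "AAGCTT");
print verdict("AAGCTA", "AAGCTT");
```

      The first call prints Y and the second prints N, each followed by a newline.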

      From a readability point of view, the indentation of the else keyword is one level too deep. And while we're discussing indentation levels anyway, the my ($strand, $chr, $position)... line should go one level deeper. These are just cosmetic issues, of course, but in the long run you really profit from picking up the good habits right from the start.

      Speaking of good habits, your script should begin with the lines use strict; and use warnings;. You might want to move your top line down two lines.
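
      To see what strict buys you here, the following sketch compiles two small snippets under use strict and reports any compile-time error. The strict_error helper is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: compiles a snippet under strict and returns the
# error message, or an empty string if it compiled and ran cleanly.
sub strict_error {
    my ($snippet) = @_;
    my $ok = eval "use strict; use warnings; $snippet; 1";
    return $ok ? '' : $@;
}

# Single quotes keep $AAGCTT from being interpolated before the eval.
print strict_error('my $data = "x"; my $hit = $data =~ m/$AAGCTT/');
print strict_error('my $data = "x"; my $hit = $data =~ m/AAGCTT/') eq ''
    ? "clean\n"
    : "error\n";
```

      The first snippet fails to compile with the same "Global symbol" complaint quoted above, and the second compiles cleanly. Without strict, the undeclared variable would silently interpolate as an empty string and the pattern would match everything.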

      Apply these fixes, and let us know how that works out for you.
