
What went wrong with this change in code?

by naturalsciences (Beadle)
on Nov 10, 2011 at 10:55 UTC
naturalsciences has asked for the wisdom of the Perl Monks concerning the following question:

I wrote a script to pick out the lines of text that contain both search phrases. It did its job without complaints. Then I needed to search lines like this: get a line (lineA) > get the next line (lineB) > search lineB for matches > if matches are found, print lineA followed by lineB.

In short, I wanted to print each line that has matches together with the line preceding it.

The original code that worked for finding matches in a single line was like this (it is a slightly shortened version):

open IN, 'sample.fa'|| die("Could not open file!");
open OUT, '>modsample.fa'|| die("Could not open file!");
while ($line = <IN>) {
    if ($line=~m/mamma/&&$line=~m/mia/) {print OUT "$line";}
    else {print ".";} #just to look at something while it works
}

It found the lines and printed them out. When I modified the script to look for matching lines and print both the matched line and the one preceding it, I got a blank output file. For some reason it could not find its matches. Here is what my modification looks like:

open IN, 'sample.fa'|| die("Could not open file!");
open OUT, '>modsample.fa'|| die("Could not open file!");
while ($line = <IN>) {
    $nextline=<IN>;
    if ($nextline=~m/mamma/&&$line=~m/mia/) {print OUT "$line$nextline";}
    else {print ".";}
}

Can you explain it to me? And maybe offer some suggestions on how to actually accomplish my goal?

Replies are listed 'Best First'.
Re: What went wrong with this change in code?
by jwkrahn (Monsignor) on Nov 10, 2011 at 11:20 UTC

    In the original you are searching for two strings in the same line but in the new version you are searching for one string in one line and the other string in the next line.

    Also, you are using the high precedence || operator which means that the program will not die if open fails because you are testing the boolean state of the string 'sample.fa' or the string '>modsample.fa' which are always true.
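The precedence point above can be checked with a small sketch. It is an illustration only: the path is hypothetical and chosen so that open is expected to fail, and it uses a lexical filehandle with three-argument open rather than the bareword handles from the question.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist, so that open fails.
my $path = '/no/such/directory/sample.fa';

# With ||, Perl parses this as open(my $fh, '<', ($path || die ...)).
# $path is a non-empty string, so die is never evaluated.
my $ok_pipes = eval { open my $fh, '<', $path || die "open failed\n"; 1 };
my $died_with_pipes = $ok_pipes ? 0 : 1;

# With the low-precedence "or", die runs whenever open returns false.
my $ok_or = eval { open my $fh, '<', $path or die "open failed: $!\n"; 1 };
my $died_with_or = $ok_or ? 0 : 1;

print "|| version died: $died_with_pipes\n";
print "or version died: $died_with_or\n";
```

Writing the call with explicit parentheses, as in open(...) || die ..., would also make the || form behave as intended.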

    You probably want something like this:

    open IN, '<', 'sample.fa' or die "Could not open 'sample.fa' because: $!";
    open OUT, '>', 'modsample.fa' or die "Could not open 'modsample.fa' because: $!";

    my $previous;
    while ( my $line = <IN> ) {
        if ( $previous && $line =~ /mamma/ && $line =~ /mia/ ) {
            print OUT $previous, $line;
        }
        else {
            print ".";
        }
        $previous = $line;
    }

      Thanks. Looks like I had been really cross-eyed with that. Good that I could borrow your eyes for a second. (Long time no scripting... I have to learn to pay more attention.)

Re: What went wrong with this change in code?
by moritz (Cardinal) on Nov 10, 2011 at 11:21 UTC

    Once you've read a line from <IN>, it is "used up", and the next call to <IN> will return the line after it.

    So when you do

    while ($line = <IN>) {$nextline=<IN>;

    You're reading two lines, and so overall $nextline will only contain every second line in the file (all even-numbered lines if you start counting from 1).

    So if all the matches are on odd-numbered lines, you won't see anything in the output.
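A minimal sketch of that behaviour, assuming a hypothetical six-line input held in memory instead of a real file. It records which lines each variable receives when the loop reads twice per pass.

```perl
use strict;
use warnings;

# Hypothetical input: six numbered lines in an in-memory file.
my $data = join '', map { "line$_\n" } 1 .. 6;
open my $in, '<', \$data or die "Could not open in-memory file: $!";

my ( @first, @second );
while ( my $line = <$in> ) {
    my $nextline = <$in>;    # consumes one more line on every pass
    chomp( $line, $nextline );
    push @first,  $line;
    push @second, $nextline;
}
close $in;

print "\$line saw:     @first\n";
print "\$nextline saw: @second\n";
```

With an even number of input lines, as here, $line only ever holds the odd-numbered lines and $nextline the even-numbered ones.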

    Try something like this instead:

    my $previous = '';
    while ($line = <IN>) {
        if ($line =~ /mamma/ && $line =~ /mia/) {
            print OUT $previous;
            print OUT $line;
        } else {
            print ".";
        }
    } continue {
        $previous = $line;
    }

    That way $line sees all the lines in the input file, and you still have the previous line available inside the loop.

      Actually, it is my intent to read and find matches only in every second line. The input has a format like this:

      >headline1
      linethatactuallycontainsthematchesiminterestedin1
      >headline2
      linethatactuallycontainsthematchesiminterestedin2

      But thanks for mentioning it. It might have been a great oversight, like the oversight I actually made (I did not replace $line with $nextline in both matches).
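For input that strictly alternates header and data lines as described, one possible sketch reads the lines in pairs and tests both patterns against the second line of each pair. The sample records, the in-memory filehandle, and the names $header and $sequence are assumptions made for illustration; they are not taken from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample records, assuming one header line followed by one data line.
my $data = <<'END';
>headline1
xxmammaxxmiaxx
>headline2
xxmammaxx
>headline3
miaxxmamma
END

open my $in, '<', \$data or die "Could not open in-memory file: $!";

my @hits;
while ( defined( my $header = <$in> ) ) {
    my $sequence = <$in>;
    last unless defined $sequence;    # dangling header at end of input

    # Both patterns are tested against the data line, not the header.
    if ( $sequence =~ /mamma/ && $sequence =~ /mia/ ) {
        push @hits, $header . $sequence;
    }
}
close $in;

print @hits;
```

If a record can span more than one data line, this pairing no longer holds and the previous-line approach from the replies above is the safer choice.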

Node Type: perlquestion [id://937328]
Approved by Corion
Front-paged by Corion