perlquestion
frogsausage
<p>Hi, I have been reading this site for answers to many questions and have always found what I needed, until now.</p>
<p>
I am trying to read two (very) long files in order to compare them in a smart way: checking if some elements (such as value=3.14 vs value="3.14") are swapped on the same line.
</p>
<p>
Also:</p>
<p>- There are a lot of lines that I want to discard as soon as I read them. I am therefore trying not to keep them in memory, as each file can go well beyond 100 000 lines.</p>
<p>- I might append one or more following lines (starting with a +) to the previous line starting with a letter, either if this first line doesn't match the one in the other file, or if one of the following lines doesn't match.</p>
<p>
Lines can look like this:</p>
<code>
ABC a b c value=3.14
+ value2="2.04"
</code>
<p>or</p>
<code>
ABC a b c value=3.14
+ value2="2.04"
+ value3=text
</code>
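<p>
To make the structure concrete, here is a rough sketch of how such a group of lines could be read as one unit. It is only an illustration: next_record and its look-ahead variable are hypothetical names, and it always joins the + lines onto the first line, whereas I only want to do that in some cases.</p>
<code>
use strict;
use warnings;

# Hypothetical helper: returns the next logical record from $fh, i.e. one
# line plus all "+" lines that directly follow it, joined by single spaces.
# $pending_ref holds the one line of look-ahead between calls.
sub next_record {
    my ($fh, $pending_ref) = @_;
    my $record = defined $$pending_ref ? $$pending_ref : <$fh>;
    $$pending_ref = undef;
    return unless defined $record;      # end of file
    chomp $record;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        if ($line =~ /^\+\s*(.*)/) {
            $record .= " $1";           # continuation: append to the record
        } else {
            $$pending_ref = $line;      # next record starts: keep it for later
            last;
        }
    }
    return $record;
}

# Usage with an in-memory file (assumed sample data)
my $data = "ABC a b c value=3.14\n+ value2=\"2.04\"\n+ value3=text\nDEF x y z\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $pending;
while (defined(my $rec = next_record($fh, \$pending))) {
    print "$rec\n";
}
</code>
<p>Only one line of look-ahead is kept per file, so nothing else has to stay in memory.</p>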
<p>
Right now, I am reading them in this very simple way:</p>
<code>
while (defined(my $lineA = <FILE_A>) && defined(my $lineB = <FILE_B>)) {
...
compare_line(lineA => $lineA, lineB => $lineB);
...
}
</code>
<p>
When running small test cases, it works really well (swap comparison etc.).
However, I see some glitches, and I guess that the longer file doesn't have its remaining lines read once the end of the shorter file is reached. The glitch is that a line starting with a + shows up as the start of a new line in my printed result (while it should always be appended to my first line).</p>
<p>
I tried changing && to ||, but it got all messed up.
I am thinking of dealing with the remaining part of the longer file after the end of the shorter one is reached, but that doesn't sound really clean.
</p><p>
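For reference, a minimal sketch of a single loop that keeps going until both files are exhausted could look like the code below. It assumes that a missing line can simply be passed as undef; walk_both and its callback are hypothetical names, not my real code. As far as I understand, || short-circuits, so FILE_B is not read at all as long as FILE_A still returns a line, which would explain the mess I got.</p>
<code>
use strict;
use warnings;

# Hypothetical driver: reads both files in step, but only stops when BOTH
# are exhausted, so the tail of the longer file is still seen.
sub walk_both {
    my ($fh_a, $fh_b, $callback) = @_;
    while (1) {
        my $line_a = <$fh_a>;   # undef once file A is exhausted
        my $line_b = <$fh_b>;   # undef once file B is exhausted
        last unless defined $line_a or defined $line_b;
        chomp $line_a if defined $line_a;
        chomp $line_b if defined $line_b;
        $callback->($line_a, $line_b);
    }
}

# Usage with in-memory files of different lengths (assumed sample data)
my ($data_a, $data_b) = ("A1\nA2\nA3\n", "B1\n");
open my $fh_a, '<', \$data_a or die "Cannot open A: $!";
open my $fh_b, '<', \$data_b or die "Cannot open B: $!";
walk_both($fh_a, $fh_b, sub {
    my ($la, $lb) = @_;
    print defined $la ? $la : '(none)', ' | ', defined $lb ? $lb : '(none)', "\n";
});
</code>
<p>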
Looking forward to reading your thoughts and suggestions!
</p><p>
-F</p>
<p>P.S.: I am running Perl 5.8.8.</p>