reading two files in parallel

baxy77bax has asked for the wisdom of the Perl Monks concerning the following question:

A very basic question: how do I read two files at the same time? Example:

file a      file b

1 read a line from a
2 read a line from b 
3 read a line from a
4 read a line from b
...

Since the files are too large and I do not have enough memory to store them, I am forced to do something like this, but I have realized that I don't know how to do it. If I have two nested while loops like this:

while(<F1>){
  #read a line
  while(<F2>){
     #read a line
     last; # go back to the main loop, but how to continue from this point in the next iteration?
  }
}

How do I continue from where I stopped in the nested loop? Thank you.

baxy

UPDATE: Thanks, moritz! That did it :)


Replies are listed 'Best First'.
Re: reading two files in parallel by moritz (Cardinal) on May 02, 2013 at 10:17 UTC
Something like this?

while (1) {
    my $a = <F1> or last;
    my $b = <F2> or last;
    # use $a and $b here
}

Perl 6 - the future is here, just unevenly distributed
Re^2: reading two files in parallel by morgon (Priest) on May 02, 2013 at 14:49 UTC
This is a nice way to do it. However, I would always advise against using $a and $b as variable names: they are "magic" names for sort, and bugs where your variables are then shadowed in a sort can be hard to track down. Better to avoid potentially troublesome names.
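A minimal sketch of the same loop with the naming advice applied. The names `$fh_a`, `$fh_b`, `$line_a`, and `$line_b` are illustrative, and the in-memory strings are stand-ins for the real files; neither comes from the thread. It also tests with `defined`, which is an addition to the posted code, so that a final line such as `0` with no trailing newline does not end the loop early.

```perl
use strict;
use warnings;

# In-memory stand-ins for the two large files (hypothetical contents).
my $text_a = "a1\na2\na3\n";
my $text_b = "b1\nb2\nb3\n";
open my $fh_a, '<', \$text_a or die "Cannot open first input: $!";
open my $fh_b, '<', \$text_b or die "Cannot open second input: $!";

my @pairs;
while (1) {
    # Stop as soon as either file is exhausted.
    defined(my $line_a = <$fh_a>) or last;
    defined(my $line_b = <$fh_b>) or last;
    chomp($line_a, $line_b);
    push @pairs, "$line_a,$line_b";
}
print "$_\n" for @pairs;
```

Only one line from each file is held in memory at a time, so the size of the files does not matter.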
Re: reading two files in parallel by Laurent_R (Canon) on May 02, 2013 at 11:57 UTC
Nested loops will not give you what you need. Besides Moritz's solution, you could also use one loop on one of the files (but not on the other):

while (my $a = <F1>) {
    my $b = <F2>;
    # do something with $a and $b
}
Re: reading two files in parallel by LanX (Saint) on May 02, 2013 at 19:32 UTC
my ($a, $b);
while ( $a = <F1>, $b = <F2>, $a or $b ) { ... }

reads till the end of the longer file. Changing `or` to `and` limits it to the shorter one.

Cheers Rolf ( addicted to the Perl Programming Language)

UPDATE: Please note: since $a and $b are not chomped, no normal input line should ever be false, and hence they need not necessarily be tested with `defined`. Better take care if you are using special filehandles allowed to return a simple 0 or null strings!!!

UPDATE: safer:

use strict;
use warnings;
use Data::Dump qw/pp/;

open my $f1, "<", \ join "\n", 1..2;
open my $f2, "<", \ join "\n", 1..5;

while ( defined (my $a = <$f1>) + defined (my $b = <$f2>) ) {
    $a .= "";
    $b .= "";
    chomp($a, $b);
    print "$a,$b\n";
}

Output:

1,1
2,2
,3
,4
,5

In boolean context, `+` is like `or` and `*` is like `and`, just without short circuit.
Re^2: reading two files in parallel by Laurent_R (Canon) on May 02, 2013 at 22:06 UTC
Beautiful idea, I did not think of this way of doing it, thank you, Rolf, it might make my module simpler... if I finally end up doing it.
Re: reading two files in parallel by sundialsvc4 (Abbot) on May 02, 2013 at 12:36 UTC
As a slight note, Moritz’s solution as-written does not seem to consider what to do with the leftover records in the longer of the two files. The necessary additions, to be placed after what is shown, are trivial ... the only trick being to ensure that the first leftover record is processed.
Re^2: reading two files in parallel by moritz (Cardinal) on May 02, 2013 at 19:12 UTC
"The necessary additions, to be placed after what is shown, are trivial"

No. The trivial additions are most certainly wrong. If F2 is exhausted first, the last line read from F1 inside the loop is lost, because `$a` is scoped to the block, and `last` leaves that block.

Perl 6 - the future is here, just unevenly distributed
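One way around the lost-line problem, sketched under assumptions: declare the line variables outside the loop so that the last record read survives `last`, then drain whichever file still has data. The names `$fh1`, `$fh2`, `$line1`, `$line2`, and `@log`, and the sample contents, are hypothetical and not taken from the thread.

```perl
use strict;
use warnings;

# In-memory stand-ins: the first input is longer than the second.
my $text_1 = join '', map { "x$_\n" } 1 .. 4;
my $text_2 = join '', map { "y$_\n" } 1 .. 2;
open my $fh1, '<', \$text_1 or die "Cannot open first input: $!";
open my $fh2, '<', \$text_2 or die "Cannot open second input: $!";

my @log;
my ($line1, $line2);    # declared outside the loop so they survive `last`
while (1) {
    $line1 = <$fh1>;
    $line2 = <$fh2>;
    last unless defined $line1 and defined $line2;
    chomp($line1, $line2);
    push @log, "pair: $line1,$line2";
}

# At most one of the two variables is still defined here;
# it holds the first leftover record of the longer file.
for my $rest ([$line1, $fh1, 'file 1'], [$line2, $fh2, 'file 2']) {
    my ($line, $fh, $name) = @$rest;
    while (defined $line) {
        chomp $line;
        push @log, "leftover from $name: $line";
        $line = <$fh>;
    }
}
print "$_\n" for @log;
```

Because both reads happen before the test, the record already pulled from the longer file is still in hand when the loop ends and is processed first among the leftovers.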
Re^2: reading two files in parallel by karlgoethebier (Abbot) on May 02, 2013 at 17:48 UTC
«As a slight note, Moritz’s solution as-written does not seem to consider what to do with the leftover records...» Maybe this is true. But what is your solution? Regards, Karl «The Crux of the Biscuit is the Apostrophe»
Re^2: reading two files in parallel by Laurent_R (Canon) on May 02, 2013 at 18:55 UTC
This thing is easy if you know that each file will have an exact match of records. It is much less easy if lines can be missing in one file or the other. I wrote a program comparing two very large files; handling all the cases of records existing in one file and not in the other, with all the special cases (file A finished before B, or the other way around), is not really trivial. I am working on transforming this program into a module as generic as possible, but, unfortunately, it is not ready to be used.
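The program described above was not posted, so the following is only a sketch of one common approach, not that program. It assumes both files are sorted in string order with one key per line, which the thread does not state. The helper `next_record` and the arrays `@only_a`, `@only_b`, and `@both` are hypothetical names.

```perl
use strict;
use warnings;

# In-memory stand-ins for two sorted files (hypothetical contents).
my $text_a = "apple\nbanana\ncherry\nfig\n";
my $text_b = "banana\ncherry\ndate\n";
open my $fh_a, '<', \$text_a or die "Cannot open first input: $!";
open my $fh_b, '<', \$text_b or die "Cannot open second input: $!";

# Read one record without its newline; returns undef at end of file.
sub next_record {
    my ($fh) = @_;
    my $line = <$fh>;
    chomp $line if defined $line;
    return $line;
}

my (@only_a, @only_b, @both);
my $rec_a = next_record($fh_a);
my $rec_b = next_record($fh_b);

while (defined $rec_a or defined $rec_b) {
    # An exhausted file is treated as sorting after every remaining record.
    my $order = !defined($rec_a) ? 1
              : !defined($rec_b) ? -1
              :                    $rec_a cmp $rec_b;
    if ($order < 0) {           # record exists only in file A
        push @only_a, $rec_a;
        $rec_a = next_record($fh_a);
    }
    elsif ($order > 0) {        # record exists only in file B
        push @only_b, $rec_b;
        $rec_b = next_record($fh_b);
    }
    else {                      # record exists in both files
        push @both, $rec_a;
        $rec_a = next_record($fh_a);
        $rec_b = next_record($fh_b);
    }
}

print "only in A: @only_a\n";
print "only in B: @only_b\n";
print "in both:   @both\n";
```

Only the smaller of the two current records is advanced on each pass, so the "file A finished before B" and "file B finished before A" cases fall out of the same comparison instead of needing separate clean-up loops.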
Re^2: reading two files in parallel by Anonymous Monk on May 02, 2013 at 23:45 UTC
Seems like this should have been a reply to moritz's post.
