PerlMonks  

Grabbing lines three-by-three from a file

by belden (Friar)
on Dec 10, 2001 at 23:47 UTC
belden has asked for the wisdom of the Perl Monks concerning the following question:

I'm cleaning up some bogus record creation dates in a database, and have a file that I need to read in three-line chunks. I know all the lines are the same length, so the seek works ok. My code:
    #!/usr/bin/perl
    use strict;
    use warnings;

    while (<DATA>) {
        my $len = length($_);
        chomp;
        push ( my @three, $_ );

        # Grab next two lines in file
        my $c;
        CALL: while (<DATA>) {
            chomp;
            last CALL if $c++ == 2;
            push ( @three, $_ );
        }

        # Do stuff to process @three
        print $_,"\t" for ( @three );
        print("\n");

        # Rewind or stop
        eof(DATA) ? last : seek(DATA, -($len*3), 1);
    }
    __DATA__
    1
    2
    3
    4
    5
    6

Output:
 1 2 3
 2 3 4
 3 4 5

Aside from adding a blank line to the end of my data (which is imho a silly workaround for a small problem), how do I get it to print the trailing triplet:
 4 5 6

Thanks -
blyman

setenv EXINIT 'set noai ts=2'

Re: Grabbing lines three-by-three from a file
by mortis (Pilgrim) on Dec 11, 2001 at 00:00 UTC
    You could treat @three like a queue: prime it with 3 lines, then repeatedly shift off the 1st element and push the next line on as the last element:

        #!/usr/bin/perl
        use strict;
        use warnings;

        # Returns the next chomped line from DATA, or undef at end of file.
        # Testing with defined() keeps a blank line or a "0" from ending the loop early.
        sub getline {
            my $line = <DATA>;
            return undef unless defined $line;
            chomp $line;
            return $line;
        }

        my @three = ( getline(), getline(), getline() );
        my $line;
        do {
            print join("\t",@three),"\n";
            @three = ( $three[1], $three[2], $line = getline() );
        } while (defined $line);
Re: Grabbing lines three-by-three from a file
by joealba (Hermit) on Dec 11, 2001 at 00:11 UTC
    If you aren't too concerned with memory usage...
        my @data = (<DATA>);
        chomp @data;
        my $i = 0;
        while ($i+2 <= $#data) {
            my @three = @data[$i..$i+2];
            print "@three\n";
            $i++;

            # Do stuff to process @three
            print $_,"\t" for ( @three );
            print("\n\n");
        }
Re: Grabbing lines three-by-three from a file
by runrig (Abbot) on Dec 11, 2001 at 00:36 UTC
        THREE: {
            my @three;
            push @three, scalar(<DATA>) or last THREE for 1..2;
            chomp @three;
            while (<DATA>) {
                chomp;
                push @three, $_;
                print "@three\n";
                shift @three;
            }
        }
Re: Grabbing lines three-by-three from a file
by tadman (Prior) on Dec 11, 2001 at 00:42 UTC
    You could always haul the entire file in and chunk it out, or you could just re-work how you're using Perl:
        #!/usr/bin/perl -w
        use strict;
        use warnings;

        my @three;

        # Stoke it
        chomp ($three[0] = <DATA>);
        chomp ($three[1] = <DATA>);

        while (!eof(DATA)) {
            chomp ($three[2] = <DATA>);

            # Do stuff to process @three
            print join ("\t", @three), "\n";

            # Rotate
            shift(@three);
        }
        __DATA__
        1
        2
        3
        4
        5
        6
    This is simply using the shift function to rotate the array as you insert new data. Some notes on your implementation:
    • Don't put any "warm up" code inside the loop, put it before the loop. This eliminates the CALL sub-loop, and any associated ambiguity with last.
    • Don't name your loops for no reason. In this case, you are using last to escape a single loop, not several in a single call, so it is redundant.
    • You're backtracking within the file using seek, which seems a little overkill, since you already have the data in @three. Shuffle it around using array methods instead of reading it again and again needlessly.
    • Your use of the ?: operator is, while valid, kind of unfortunate. It is normally used in an inline capacity, such as where there is no room for an if. While it might take more room, using if will make it a lot clearer what you're doing.
    Still, points for 'use warnings' and 'use strict'.
      Don't name your loops for no reason.

      IMHO, there's nothing wrong with naming your loops, especially (but not only) when there are multiple or nested loops in the general vicinity. It just makes things more clear, and more self-documenting.
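
      As a minimal sketch of the nested-loop case mentioned above (the @rows data and the ROW label are made up for illustration, not taken from the original code), a label lets last leave both loops at once:

```perl
use strict;
use warnings;

# Hypothetical data: find the first row that contains a negative number.
my @rows = ( [ 1, 2, 3 ], [ 4, -5, 6 ], [ 7, 8, 9 ] );

my $found;
ROW: for my $i ( 0 .. $#rows ) {
    for my $value ( @{ $rows[$i] } ) {
        if ( $value < 0 ) {
            $found = $i;
            last ROW;    # leaves both loops, not just the inner one
        }
    }
}

print defined $found ? "first negative in row $found\n" : "no negatives\n";
```

      Without the label, last would only leave the inner loop and the outer loop would carry on with the next row.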

        The last function, by definition, bails out of the loop that it is currently inside. If there was any confusion as to where the last call would be tossing program flow to, a comment might help. The loop label indicates the start of the loop, not the finish, so you have to read backwards to find the label, then scan forwards to find the loop finish point.

        Why not this?
            while (something()) {
                # Lots of stuff, nested functions, etc.
                last if ($condition); # Move to Post-Processing
                # Lots more stuff
            }

            # Post-Processing
            more_stuff();
        Instead of:
            PROCESS: while (something()) {
                # Lots of stuff, nested functions, etc.
                last PROCESS if ($condition);
                # Lots more stuff
            }

            more_stuff();
        If the while() structure was very long, it might be easier to scan for a carefully worded comment.

        If you have a reason for putting loop labels in, by all means put them in. All I'm advocating is that there shouldn't be things in a program that are there for no reason.

        Or maybe I just don't like Perl programs which feel like:
            _10: print "Hello";
            _20: goto _10;
Re: Grabbing lines three-by-three from a file
by merlyn (Sage) on Dec 11, 2001 at 02:10 UTC
    A variation of this is answered in "Re: Reading multiple lines?". Maybe that technique could be of use.

    For example, cutting and pasting and altering slightly:

        my @buffer;
        {
            push @buffer, scalar <IN>;
            redo unless eof(IN) or @buffer >= 10;
            ## process @buffer
            shift @buffer; ## [was:] @buffer = ();
            redo unless eof(IN);
        }
    Change the 10 to 3 and you're nearly done.
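
    One possible way to finish that off, as a sketch only: the same redo-block idiom wrapped in a sub with the window size as a parameter. The name windows, the lexical handle $in and the in-memory sample text are assumptions for illustration. The leading eof check and the "if @buffer == $size" guard are additions to cope with files shorter than one window.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every window of $size consecutive lines
# read from the filehandle $in, using the redo-block idiom.
sub windows {
    my ( $in, $size ) = @_;
    my ( @buffer, @out );
    return @out if eof($in);    # added guard: nothing to read at all
    {
        my $line = <$in>;
        chomp $line;
        push @buffer, $line;
        redo unless eof($in) or @buffer >= $size;

        # process @buffer (here: keep a copy of each full window)
        push @out, [@buffer] if @buffer == $size;
        shift @buffer;
        redo unless eof($in);
    }
    return @out;
}

# Usage with an in-memory file standing in for the real one.
my $sample = join '', map { "$_\n" } 1 .. 6;
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
print join( "\t", @$_ ), "\n" for windows( $in, 3 );
```

    Only the lines of the current window are held in @buffer at any time. The @out list is just there to make the sketch easy to check, and for a large file the processing would go where the push is.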

    -- Randal L. Schwartz, Perl hacker

Re: Grabbing lines three-by-three from a file
by belden (Friar) on Dec 11, 2001 at 03:29 UTC
    Thanks to all who responded. I should have mentioned earlier that my file is 5meg in size with 169312 lines - larger than I want to slurp into an array all at once.

    It's been a bit of a trick to go through and figure out what the different replies have in common. The most obvious trick (to others, at least!) was $foo = <FH>; which apparently pops a single line off of FH, sticks it in $foo, and updates FH so the next read will return the next line in the file.

    I feel pretty silly; I'm used to code like this

        while ( $line = <FH> ) {
            # do stuff to $line
        }
    but I didn't realize I could $foo = <FH>; outside of a while-ing context. While this is a worthwhile thing to know, for me the larger lesson was using a different approach.
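
    A minimal sketch of that behaviour, assuming an in-memory filehandle in place of a real file ($fh and the sample text are made up for illustration): in scalar context <$fh> returns one line and advances the handle, while in list context it returns everything that is left.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real file: an in-memory filehandle.
my $text = "first\nsecond\nthird\nfourth\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my $one  = <$fh>;    # scalar context: reads exactly one line
my $two  = <$fh>;    # the handle remembers its position, so this is line 2
my @rest = <$fh>;    # list context: reads all remaining lines at once
chomp( $one, $two, @rest );

print "one=$one two=$two rest=@rest\n";
```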

    tadman said it best: you could just re-work how you're using Perl. Indeed. I was thinking I needed to read three lines, process them, then skip backwards two lines and repeat the cycle. It seems that everyone else realized I just needed to prime my array by reading two lines, then push a third line onto the array, process the lot, shift the first one off, and start pushing and processing again.
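
    The prime, push, process, shift cycle described above can be sketched for any window size. This is an illustration under assumptions, not code from any reply: window_reader, $size and the in-memory sample data are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that yields the next window of
# $size consecutive lines on each call, or nothing at end of file.
sub window_reader {
    my ( $fh, $size ) = @_;
    my @window;
    return sub {
        shift @window if @window == $size;    # drop the oldest line
        while ( @window < $size ) {           # prime, or top up by one
            my $line = <$fh>;
            return unless defined $line;      # ran out of lines
            chomp $line;
            push @window, $line;
        }
        return [@window];
    };
}

# In-memory file standing in for the real one.
my $text = join '', map { "$_\n" } 1 .. 6;
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my $next = window_reader( $fh, 3 );
while ( my $three = $next->() ) {
    print join( "\t", @$three ), "\n";
}
```

    At most $size lines are in memory at once, so the size of the file does not matter.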

    For me the neat part about SOPW and the various answers is finding out which answer (if any) the poster decided to go with. In my case, I'm going with the answer that looks the most like my current code - an adaptation of tadman's response. This isn't based off of benchmarks or how easily I can incorporate the changes into my existing code. Perhaps it just seems closest to my Perl idiolect (awkward pidgin that it is).

        #!/usr/bin/perl
        use strict;
        use warnings;

        my @three;

        # Stoke it
        chomp ($three[0] = <DATA>);
        chomp ($three[1] = <DATA>);

        while (<DATA>) {
            chomp ($three[2] = $_);

            # Do stuff to process @three
            print join ("\t", @three), "\n";

            # Rotate
            shift(@three);
        }
        __DATA__
        1
        2
        3
        4
        5
        6
    Thanks again- blyman
    setenv EXINIT 'set noai ts=2'
