PerlMonks  

How to restart a while loop that is iterating through each line of a file?

by cookersjs (Acolyte)
on Nov 28, 2016 at 21:51 UTC

cookersjs has asked for the wisdom of the Perl Monks concerning the following question:

Greetings monks,
I am currently trying to solve a loop riddle that has bested me.
The problem: I am trying to take the first 16 lines of a single file, and populate those same 16 lines in 34 different (but incrementally named) files. My current code:

    my $filename = "../../Annotation_output";
    for (my $i=0; $i<=33; $i++) {
        my $file_count = $i;
        open(my $new_fh, '>>', ("$filename" . "$file_count" . ".vcf"));
        my $count = 0;
        while (<$fh>) {
            print $new_fh "$_";
            $count++;
            if ($count == 16) {
                last;
            }
        }
    }
$fh is the filehandle for the file I'm trying to take the 16 lines from, FYI.

I have had the first file (Annotation_output0.vcf) gain the 16 lines needed, but never any files afterward.

Common outputs have included each of the 34 files gaining 16 lines with the while loop never resetting (so Annotation_output0.vcf = lines 0-15, Annotation_output1.vcf = lines 16-31, Annotation_output2.vcf = lines 32-47, and so on), as well as just output0.vcf gaining 16 lines while output1.vcf gains everything else the while loop reads, with the for loop never advancing past $i = 2.

Any help is appreciated!

Replies are listed 'Best First'.
Re: How to restart a while loop that is iterating through each line of a file?
by choroba (Cardinal) on Nov 28, 2016 at 22:04 UTC
    You never rewind the $fh back, so no wonder it reads the next 16 lines in the next iteration of the loop. If $fh points to a real file, you can set its position back to the beginning using seek:
    seek $fh, 0, 0;

    If $fh is a pipe, socket, or similar, though, it would be better to read 16 lines from it into an array before entering the loop:

    # read 16 lines from $fh once, up front
    my @sixteen_lines = map scalar <$fh>, 1 .. 16;
    for my $i (0 .. 33) {
        open my $new_fh, '>>', "$filename$i.vcf" or die $!;
        print {$new_fh} @sixteen_lines;
    }

    You should probably consider the situation where the input file is shorter than 16 lines. Also, it's unclear why you're opening the output files for appending.
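For what it's worth, the seek approach described above can be sketched roughly as follows. This is only a minimal sketch: first_lines() is a hypothetical helper name, and the in-memory file stands in for the real input so the example runs on its own. It also copes with input shorter than the requested number of lines.

```perl
use strict;
use warnings;

# Return the first $n lines of an already-open, seekable handle.
# first_lines() is a hypothetical helper name, not something from the thread.
sub first_lines {
    my ($fh, $n) = @_;
    seek $fh, 0, 0 or die "Cannot seek: $!";    # rewind to the start
    my @lines;
    while (my $line = <$fh>) {
        push @lines, $line;
        last if @lines == $n;    # stops early if the input is shorter
    }
    return @lines;
}

# Demo on an in-memory file so the sketch needs no input file.
my $text = join '', map { "line $_\n" } 1 .. 20;
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @first  = first_lines($fh, 3);
my @second = first_lines($fh, 3);    # the same three lines again, thanks to seek
print @second;
```

Counting with the array size rather than $. is deliberate here, since seek does not reset $. on its own.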

      seek was the exact command I was looking for.

      Thanks very much! -cookersjs
Re: How to restart a while loop that is iterating through each line of a file?
by Discipulus (Canon) on Nov 28, 2016 at 22:09 UTC
    If it is just a matter of 16 lines from a file, then read them into memory (an array, for example), like:

    # untested
    open my $read_fh, '<', 'source.txt' or die "$!";
    my @lines;
    # read line by line (scalar context) so that $. counts up to 16
    while (my $line = <$read_fh>) {
        push @lines, $line;
        last if $. == 16;
    }
    foreach my $count (0..33) {
        open my $out, '>', 'afilename' . $count . '.log' or die "$! writing";
        print $out $_ for @lines;
        close $out;
    }

    But if you really need to restart a loop that reads a file, you can use seek on the filehandle.

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: How to restart a while loop that is iterating through each line of a file?
by Laurent_R (Canon) on Nov 28, 2016 at 22:50 UTC
    This may not be the most efficient way to do it (seek would probably be more efficient, and storing the 16 lines in an array certainly more efficient), but if you opened your $fh input file handle inside the for loop, the file handle would start at the beginning of the file on each pass through the loop, and you would get your first 16 lines appended to each output file.

    As a side note, when you're reading from a file handle, the $. special variable contains the number of the line just read, so you don't really need the $count variable; you can just check $. to stop your while loop.

    Finally this line:

    for (my $i=0; $i<=33; $i++) {
    might be written more simply as:
    for my $i (0..33) {
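To illustrate the two points above (reopening the input on each pass, and using $. instead of a separate counter), here is a small sketch. head_of_file() is a hypothetical helper name, and the temporary file only stands in for the real input.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read at most $n lines from the file at $path, using $. as the counter.
# head_of_file() is a hypothetical helper name.
sub head_of_file {
    my ($path, $n) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    my @lines;
    while (my $line = <$in>) {
        push @lines, $line;
        last if $. >= $n;    # $. is the number of the line just read from $in
    }
    close $in;               # an explicit close also resets $.
    return @lines;
}

# Demo: a throwaway 20-line file, read from the top three times.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "row $_\n" for 1 .. 20;
close $tmp_fh or die "Cannot close $tmp_path: $!";

for my $i (0 .. 2) {
    my @head = head_of_file($tmp_path, 16);
    printf "pass %d: %d lines\n", $i, scalar @head;
}
```

Because the file is opened afresh inside the helper, every call starts at the first line, at the cost of one extra open per pass.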
      It would be even better to concatenate the sixteen lines into a single string. The range operator (..) can be used with perl style 'for' loops and statement modifiers.
      use strict;
      use warnings;
      use autodie;

      my $file = "0\n1\n2\n3\n4\n5\n6\n7\n8\n9\n"
                ."10\n11\n12\n13\n14\n15\n16\n17\n18\n19\n";
      open my $fh, '<', \$file;
      my $filename = 'Annotation_output';    # Directory removed for testing

      my $data = '';
      $data .= <$fh> for (1..16);
      for (0..33) {
          open my $new_fh, '>', "$filename$_.vcf";
          print $new_fh $data;
          close $new_fh;
      }
      Bill
Re: How to restart a while loop that is iterating through each line of a file?
by Marshall (Canon) on Nov 28, 2016 at 23:28 UTC
    This kind of "for" loop: for (my $i=0; $i<=33; $i++) is not common in Perl. Try this:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $filename = '16lines.txt';
    open (INFILE, '<', $filename)
        or die "unable to read the file with 16+ lines $!";
    my @lines = <INFILE>;    # read all lines into an array

    for my $file_no (0..33)    # total 34 files?? Really? Maybe 0..31?
    {
        open (my $fh, ">", "$filename$file_no.vcf")
            or die "problem opening $filename$file_no.vcf Oops $!";
        print $fh grep{defined}@lines[0..15];
        # the grep on defined lines is in case there are fewer than
        # 16 lines in the input file. I am sure there are other ways,
        # perhaps with split(), to accomplish the same thing.
    }
      for my $file_no (0..33) #total 34 files??
      Yes, I asked myself the same question, but the original post says 34 different files.
      I have a large file (~3.4 million lines) that I am trying to break into 100,000 line chunks. The file header is needed for all of the smaller files, which is why I wanted the first 16 lines populated over 34 files.

      Thanks for the suggestions monks! -cookersjs
        Ok, I think I see what you are trying to do. Here is one way of many to code this sort of thing:
        #!/usr/bin/perl
        use strict;
        use warnings;

        my $nHeaderLines = 2;
        my $nDataLinesPerFile = 4;
        my @header;

        # the "Big File" is the DATA segment below,
        # maybe millions of lines...
        for (1..$nHeaderLines)    # read header lines from big file
        {
            my $header_line = <DATA>;
            push @header, $header_line;
        }

        # divide the big file data into smaller files,
        # each with the initial header...
        my $nFile = 0;
        my $fileNameBase = "SmallerFile";
        my $nDataLine = 99999;
        my $line;
        while ($nDataLine++, defined ($line = <DATA>))
        {
            if ($nDataLine > $nDataLinesPerFile)    # start new file
            {
                $nFile++;
                my $name = "$fileNameBase$nFile.txt";
                open (OUT, '>', "./$name") or die "$!";
                print OUT @header;
                $nDataLine=1;
            }
            print OUT $line;
        }
        print "Program Done!\n";

        __DATA__
        Header 1
        Header 2
        data 1
        data 2
        data 3
        data 4
        data 5
        data 6
        data 7
        data 8
        data 9
        PS: This code has no advance knowledge of how many smaller files will be created. Nothing like "34 files" is hard coded.
        The code creates "as many files as are needed". In the case above, there are 3 files. data lines 1,2,3,4 in one file, data lines 5,6,7,8 in another and the last 9th data line in a third file.
Re: How to restart a while loop that is iterating through each line of a file?
by Arunbear (Prior) on Nov 29, 2016 at 11:00 UTC
    For some tasks, writing a script can be overkill. You can do this in a shell:
    > I am trying to take the first 16 lines of a single file
    There's a tool called head which does just that:
    $ head -n 16 lines.txt > temp.lines.txt
    > and populate those same 16 lines in 34 different (but incrementally named) files
    Now use a loop and copy the saved lines to the output files:
    $ for i in $(seq 0 33); do cp temp.lines.txt "Annotation_output$i.vcf"; done
    And remove the temp file:
    $ rm temp.lines.txt
Re: How to restart a while loop that is iterating through each line of a file?
by CountZero (Bishop) on Nov 29, 2016 at 07:00 UTC
    Or just populate the file once and then copy it 33 times? Only showing the 'copy' part here.
    use Modern::Perl qw/2015/;
    use File::Copy;

    my $source      = 'D:\Perl\scripts\data0.txt';
    my $destination = 'D:\Perl\scripts\data';
    copy( $source, "$destination$_.txt" ) or die "Copy failed: $!" for ( 1 .. 33 );
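Since only the copy part is shown above, here is one way the whole "populate once, then copy" idea might look. It is a sketch under assumptions: fan_out() and its parameters are hypothetical names, the header lines are made-up placeholders, and a temporary directory replaces the real output location. It uses only core modules (File::Copy, File::Spec, File::Temp).

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;
use File::Temp qw(tempdir);

# Write @$lines to "${base}0.vcf" inside $dir, then copy that file to
# "${base}1.vcf" .. "${base}${last}.vcf". Returns the paths created.
# fan_out() and its parameters are hypothetical names.
sub fan_out {
    my ($dir, $base, $last, $lines) = @_;
    my $first = File::Spec->catfile($dir, "${base}0.vcf");
    open my $out, '>', $first or die "Cannot write $first: $!";
    print {$out} @$lines;
    close $out or die "Cannot close $first: $!";

    my @paths = ($first);
    for my $i (1 .. $last) {
        my $dest = File::Spec->catfile($dir, "$base$i.vcf");
        copy($first, $dest) or die "Copy to $dest failed: $!";
        push @paths, $dest;
    }
    return @paths;
}

# Demo: 16 placeholder header lines fanned out to files 0 .. 33.
my $dir    = tempdir(CLEANUP => 1);
my @header = map { "header $_\n" } 1 .. 16;
my @made   = fan_out($dir, 'Annotation_output', 33, \@header);
print scalar(@made), " files written\n";
```

Closing the first file before copying matters: it makes sure the buffered lines are on disk before copy() reads them.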

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: How to restart a while loop that is iterating through each line of a file?
by LanX (Saint) on Nov 29, 2016 at 13:41 UTC
    Your title says

    > ... that is iterating through each line of a file ...

    your post says

    > ... trying to take the first 16 lines of a single file ...

    that's confusing.

    I see three possible cases:

    • first chunk of 16 lines
    • successive chunks of 16 lines
    • overlapping chunks of 16 lines
    The second example in the following post shows an approach to generically solve all cases with just minor changes:

    Re^2: Grab 3 lines before and 2 after each regex hit (sliding window)

    HTH!

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

      Sorry about the confusion, should have double-checked my title made sense

      -cookersjs
