PerlMonks  

Re: Count number of lines in a text file

by Anonymous Monk
on Aug 18, 2014 at 21:34 UTC ( [id://1097908]=note )


in reply to Count number of lines in a text file

The sysread problem can be corrected with a fairly simple change: check whether the last buffer ends with an end-of-line character, and count one more line if it does not. Sample program showing some of the variations:
#
# demonstrate the sysread (and read) problem and correction. we really only need a small
# file to demonstrate the problem....
#
use strict;

my $test_file="sysread_test_file.txt"; # change name to test size/length differences

#
# create the test file
#
sub create_test_file
{
    return if -e $test_file; # we do not want to create if it already exists....
    open TOUT,">$test_file";
    #
    # write a small file with an extra line missing the EOL.
    #
    for (my $line=0;$line<1000;$line++)
    {
        print TOUT "qwertyuiopasdfghjklzxcvbnm1234567890qwertyuiopasdfghjklzxcvbnm1234567890qwertyuiopasdfghjklzxcvbnm1234567890\n";
    }
    print TOUT "qwertyuiopasdfghjklzxcvbnm1234567890qwertyuiopasdfghjklzxcvbnm1234567890qwertyuiopasdfghjklzxcvbnm1234567890"; # no EOL!
    close TOUT;
}

sub test_while_variable
{
    my $linecount=0;
    open TIN,"<$test_file";
    while (<TIN>)
    {
        $linecount++;
    }
    close TIN;
    print "test_while_variable: $linecount\n";
}

sub test_block_read($)
{
    my $block_size=$_[0];
    open TIN,"<$test_file";
    binmode TIN;
    my ($data, $n);
    my $newlinecount=0;
    while ((read TIN, $data, $_[0]) != 0)
    {
        $newlinecount+=($data =~ tr/\012//);
    }
    close(TIN);
    print "test_block_read: $newlinecount\n";
}

sub test_fixed_block_read($)
{
    my $block_size=$_[0];
    open TIN,"<$test_file";
    binmode TIN;
    my ($data, $n);
    my $newlinecount=0;
    while ((read TIN, $data, $_[0]) != 0)
    {
        $newlinecount+=($data =~ tr/\012//);
    }
    close(TIN);
    $newlinecount++ if $data !~ /\012$/;
    print "test_fixed_block_read: $newlinecount\n";
}

sub test_block_sysread($)
{
    my $block_size=$_[0];
    open TIN,"<$test_file";
    my ($data, $n);
    my $newlinecount=0;
    while ((sysread TIN, $data, $_[0]) != 0)
    {
        $newlinecount+=($data =~ tr/\012//);
    }
    close(TIN);
    print "test_block_sysread: $newlinecount\n";
}

sub test_fixed_block_sysread($)
{
    my $block_size=$_[0];
    open TIN,"<$test_file";
    my ($data, $n);
    my $newlinecount=0;
    while ((sysread TIN, $data, $_[0]) != 0)
    {
        $newlinecount+=($data =~ tr/\012//);
    }
    close(TIN);
    $newlinecount++ if $data !~ /\012$/;
    print "test_fixed_block_sysread: $newlinecount\n";
}

#
# do the test
#
create_test_file; # create the test file if not already present
test_while_variable;
test_block_read 4096;
test_fixed_block_read 4096;
test_block_sysread 4096;
test_fixed_block_sysread 4096;
exit 0;

Re^2: Count number of lines in a text file
by Anonymous Monk on Aug 29, 2014 at 15:37 UTC

    Correction to the "fixed" routines in the demonstration code: they did not handle the "good" EOL case correctly. The final read (or sysread) call, the one that returns 0 at end of file, leaves $data empty, so the check after the loop always concluded that the EOL was missing, even when it was there (I guess I learn something every day). The tr/\012// itself only counts; it does not modify $data. This fix may not be perfect: I think it will return 1 for an empty file (untested), but I am sure there is a fix for that if someone really wants to do it.
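A quick check of what the buffer holds once the loop has finished may help here. This is only a sketch, not part of the demonstration program: `buffer_after_loop` is a hypothetical helper name, and the temporary file from File::Temp stands in for the real test file. It copies each block before running `tr/\012//` so the two can be compared, and it reports the length of `$data` after the read that returned 0.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not from the original post): read a file in blocks
# and report what the buffer holds once the loop has finished.
sub buffer_after_loop {
    my ($path, $block_size) = @_;
    open(my $fh, '<', $path) or die "cannot open $path: $!";
    binmode $fh;
    my ($data, $newlines) = ('', 0);
    my $last_block = '';
    while (read($fh, $data, $block_size)) {
        $last_block = $data;                 # copy taken before tr runs
        $newlines  += ($data =~ tr/\012//);  # counts only
        die "tr changed the buffer" if $data ne $last_block;
    }
    close $fh;
    return (length($data), $newlines, $last_block);
}

# Small file whose last line does have an EOL.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "one\ntwo\n";
close $tmp_fh;

my ($len, $newlines, $last) = buffer_after_loop($tmp_path, 4096);
print "length of \$data after loop: $len\n";
print "newlines counted: $newlines\n";
print "last non-empty block ends with EOL: ",
      ($last =~ /\012\z/ ? "yes" : "no"), "\n";
```

If the length printed is 0, any test of `$data` made after the loop is looking at an empty string, which is why the EOL state has to be recorded inside the loop.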

    I will not be offended if someone finds yet another problem (besides the empty file problem) and comes up with a solution. That is what this site is for: to help people do a better job writing Perl code, and to teach people little things about Perl that they may not have known or thought much about.

    sub test_fixed_block_read($)
    {
        my $block_size=$_[0];
        open TIN,"<$test_file";
        binmode TIN;
        my ($data, $n);
        my $newlinecount=0;
        my $block_ends_with_eol=0;
        while ((read TIN, $data, $_[0]) != 0)
        {
            # record the state of the most recent block only, so that an earlier
            # block ending in EOL cannot hide a missing EOL at the end of the file
            $block_ends_with_eol=((substr $data,-1,1) eq "\n") ? 1 : 0;
            $newlinecount+=($data =~ tr/\012//);
        }
        close(TIN);
        $newlinecount++ if (!$block_ends_with_eol);
        print ">>>test_fixed_block_read: $newlinecount\n";
    }

    sub test_fixed_block_sysread($)
    {
        my $block_size=$_[0];
        open TIN,"<$test_file";
        my ($data, $n);
        my $newlinecount=0;
        my $block_ends_with_eol=0;
        while ((sysread TIN, $data, $_[0]) != 0)
        {
            # same as above: overwrite the flag for every block read
            $block_ends_with_eol=((substr $data,-1,1) eq "\n") ? 1 : 0;
            $newlinecount+=($data =~ tr/\012//);
        }
        close(TIN);
        $newlinecount++ if (!$block_ends_with_eol);
        print ">>>test_fixed_block_sysread: $newlinecount\n";
    }
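For the empty file case mentioned above, one possible approach is to remember the last byte seen and start it off as a newline, so a file with no data adds nothing to the count. This is a sketch under assumptions rather than a tested replacement for the routines above: `count_lines` and its default block size of 4096 are hypothetical choices, and it uses a lexical filehandle instead of `TIN`.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original post: count lines with
# sysread, treating a final line without EOL as a line and an empty file as 0.
sub count_lines {
    my ($path, $block_size) = @_;
    $block_size ||= 4096;
    open(my $fh, '<', $path) or die "cannot open $path: $!";
    binmode $fh;
    my $count     = 0;
    my $last_byte = "\n";   # so that an empty file adds nothing
    my $data;
    while (1) {
        my $got = sysread($fh, $data, $block_size);
        die "read error on $path: $!" unless defined $got;
        last if $got == 0;                  # end of file
        $count    += ($data =~ tr/\012//);  # count newlines in this block
        $last_byte = substr($data, -1, 1);  # remember how this block ended
    }
    close $fh;
    $count++ if $last_byte ne "\n";         # final line had no EOL
    return $count;
}

# Usage: perl count_lines.pl some_file.txt
print count_lines($ARGV[0]), "\n" if @ARGV;
```

Because `$last_byte` is overwritten on every pass, only the end of the final non-empty block decides whether the extra line is counted.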
