Help needed in reading a very large file line by line

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: Help needed in reading a very large file line by line by tobyink (Canon) on Feb 28, 2012 at 10:50 UTC
The example you linked to deals with files with fixed-length records, i.e. generally binary files. As you're talking about "line by line", I assume you're talking about a text-based file, e.g. plain text, HTML, CSV, etc.

```perl
# Set the character which will be used to indicate the end of a line.
# This defaults to the system's end of line character, but it doesn't
# hurt to set it explicitly, just in case some other part of your code
# has altered it from the default.
local $/ = "\n";

# Open the file for read access:
open my $filehandle, '<', 'myfile.txt'
    or die "Cannot open myfile.txt: $!";

my $line_number = 0;

# Loop through each line:
while (defined(my $line = <$filehandle>)) {
    # The text of the line, including the linebreak
    # is now in the variable $line.

    # Keep track of line numbers
    $line_number++;

    # Strip the linebreak character at the end.
    chomp $line;

    # Do something with the line.
    do_something($line);

    # Perhaps bail out of the loop
    if ($line =~ m/^ERROR/) {
        warn "Error on line $line_number - skipping rest of file";
        last;
    }
}
```

But we can make the above more concise, because Perl usefully defines a variable called `$_` which is used as a default variable in many cases, and a variable called `$.` which keeps track of the current line number.

```perl
# Set the character which will be used to indicate the end of a line.
local $/ = "\n";

# Open the file for read access:
open my $filehandle, '<', 'myfile.txt'
    or die "Cannot open myfile.txt: $!";

# Loop through each line:
while (<$filehandle>) {
    # The text of the line, including the linebreak
    # is now in the variable $_.

    # Strip the linebreak character at the end.
    chomp;

    # Do something with the line.
    do_something($_);

    # Perhaps bail out of the loop
    if (m/^ERROR/) {
        warn "Error on line $. - skipping rest of file";
        last;
    }
}
```
Re^2: Help needed in reading a very large file line by line by AnomalousMonk (Archbishop) on Feb 28, 2012 at 22:07 UTC
"The example you linked to deals with files with fixed-length records..."

NB: "Re: help reading from large file needed" begins by briefly alluding to processing files with fixed-length records, but then continues with a detailed discussion, with an example, of indexing a variable-length record file for rapid random access.
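The indexing idea mentioned here can be sketched roughly as follows. This is only an illustration of the general technique, not the code from the linked post: the helper names `build_index` and `fetch_line` are hypothetical, and the sketch assumes newline-terminated records and enough memory to hold one byte offset per line.

```perl
use strict;
use warnings;

# Hypothetical helper: record the byte offset at which each line starts.
sub build_index {
    my ($fh) = @_;
    local $/ = "\n";
    my @offsets = (0);
    while (defined(my $line = <$fh>)) {
        push @offsets, tell($fh);
    }
    pop @offsets;    # the last entry is the end of the file, not a line start
    return \@offsets;
}

# Hypothetical helper: seek straight to line $n (0-based) and read only that line.
sub fetch_line {
    my ($fh, $offsets, $n) = @_;
    return if $n < 0 || $n > $#$offsets;
    local $/ = "\n";
    seek($fh, $offsets->[$n], 0) or die "seek failed: $!";
    my $line = <$fh>;
    chomp $line if defined $line;
    return $line;
}
```

After `my $index = build_index($fh);` a call such as `fetch_line($fh, $index, 41)` would return the 42nd line without rereading the lines before it. Only the offsets are held in memory, never the file contents.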
Re^2: Help needed in reading a very large file line by line by Anonymous Monk on Feb 28, 2012 at 10:59 UTC
while(<FILEHANDLE>) is giving an out of memory error.
Re^3: Help needed in reading a very large file line by line by jethro (Monsignor) on Feb 28, 2012 at 11:43 UTC
Just post your script (between <c> </c> tags) and we can tell you what is wrong.
Re^3: Help needed in reading a very large file line by line by tobyink (Canon) on Feb 28, 2012 at 13:26 UTC
Chances are that either `$/` is set to something silly instead of `"\n"`, or your file has no line break characters in it (or at least has some very long lines).
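One way to check both suspicions without ever holding a whole "line" in memory is to read the file in fixed-size blocks and measure the distance between line breaks. The sketch below relies on the documented behaviour that setting `$/` to a reference to an integer makes `<$fh>` return records of at most that many bytes; the helper name `line_stats` and the default block size are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return (number of "\n" characters,
# length in bytes of the longest line, not counting the "\n").
sub line_stats {
    my ($fh, $block_size) = @_;
    $block_size ||= 1 << 20;      # 1 MiB blocks unless told otherwise
    local $/ = \$block_size;      # read fixed-size records, not lines

    my ($newlines, $longest, $current) = (0, 0, 0);
    while (defined(my $block = <$fh>)) {
        my $pos = 0;
        while ((my $nl = index($block, "\n", $pos)) >= 0) {
            $current += $nl - $pos;
            $longest = $current if $current > $longest;
            $current = 0;
            $newlines++;
            $pos = $nl + 1;
        }
        # Carry the unterminated tail over into the next block.
        $current += length($block) - $pos;
    }
    $longest = $current if $current > $longest;
    return ($newlines, $longest);
}
```

If the first value comes back as zero, or the second is close to the size of the file, then `while (<$fh>)` is effectively trying to read the whole file as one line.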
Re^3: Help needed in reading a very large file line by line by Anonymous Monk on Feb 28, 2012 at 11:10 UTC
:) Impossible, see

```
$ perl -le " while(<FILEHANDLE>) "
syntax error at -e line 1, at EOF
Execution of -e aborted due to compilation errors.
```

See How do I post a question effectively?
Re^3: Help needed in reading a very large file line by line by MegART (Novice) on Dec 01, 2014 at 17:22 UTC
Hi, I see this is an old thread, but still, I would like to share that I have experienced something similar. I wanted to do a very simple search and replace on a huge ASCII file (around 4GB) using the magic filehandle <>. The thing is that I cannot use seek or any other method that requires fixed-length records. Also, my $/ is set to "\n" and I know that the lines are not incredibly long. Any ideas? Here is a piece of code:

```perl
my $fh = new FileHandle;
@ARGV = ($file);
open $fh, ">test.txt";
while ($line = <>) {
    $line =~ s/$search/$replace/g;
    print $fh $line;
}
```
Re^4: Help needed in reading a very large file line by line by choroba (Cardinal) on Dec 01, 2014 at 17:29 UTC
Re: Help needed in reading a very large file line by line by choroba (Cardinal) on Feb 28, 2012 at 10:41 UTC
To read a file line by line, just use

```perl
while (<>) {
    # process the line contained in $_
}
```

What do you mean by "does not seem to be working"?
Re^2: Help needed in reading a very large file line by line by Anonymous Monk on Feb 28, 2012 at 10:48 UTC
Opening the big file with the open statement and then using while(<FILEHANDLE>) gives an Out of Memory! error.
Re^3: Help needed in reading a very large file line by line by Anonymous Monk on Feb 28, 2012 at 11:08 UTC
So the file doesn't contain lines?
Re^3: Help needed in reading a very large file line by line by ddragosa (Acolyte) on Mar 20, 2015 at 08:25 UTC
It is giving "Out of memory" because in your environment (I suppose it is a UNIX-based one) the "data(kbytes)" limit reported by "ulimit -a" is less than the file's size. Try raising the data limit to a value larger than the file you are processing. If you can't do that, use Tie::File: it is slower, but it is not a memory hog.
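For reference, a minimal Tie::File sketch might look like the following. Tie::File ships with Perl and presents the file as an array whose elements are read from disk on demand. The helper name `nth_line` is hypothetical, and opening the file read-only is an assumption that no in-place edits are wanted.

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use Tie::File;

# Hypothetical helper: return line $n (0-based) of $filename
# without reading the whole file into memory.
sub nth_line {
    my ($filename, $n) = @_;
    tie my @lines, 'Tie::File', $filename, mode => O_RDONLY
        or die "Cannot tie $filename: $!";
    my $line = $lines[$n];    # the record separator is already stripped
    untie @lines;
    return $line;
}
```

Tie::File still has to scan forward to find the requested line and remembers the offsets of the lines it has passed, so it trades speed for a much smaller memory footprint rather than eliminating the cost entirely.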
Re: Help needed in reading a very large file line by line by trizen (Hermit) on Feb 28, 2012 at 15:06 UTC
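For very long lines, you can try something like this:

```perl
open my $fh, '<', $filename or die "Cannot open $filename: $!";

my $line            = '';
my $track           = 0;
my $max_line_length = 1024;    # or whatever

# Read one character at a time and emit a chunk at each line break,
# or as soon as the chunk reaches the maximum length.
while (defined(my $char = getc $fh)) {
    $line .= $char;
    if (++$track == $max_line_length or $char eq "\n") {
        print $line;
        $line  = '';
        $track = 0;
    }
}

# Flush whatever is left if the file does not end with a line break.
print $line if length $line;

close $fh;
```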
Re: Help needed in reading a very large file line by line by CountZero (Bishop) on Feb 28, 2012 at 17:26 UTC
How is a "line" defined in this file? It seems the definition of "line" in your file is different from what Perl expects a line to be.

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics
Re: Help needed in reading a very large file line by line by Anonymous Monk on Feb 28, 2012 at 10:37 UTC
Also, can somebody please explain the code in "Re: help reading from large file needed" in a more readable form? I am new to Perl, and that post seemed to have a lot of deep Perl in it; I couldn't understand any of it.

