PerlMonks  

What is the best way to remove(delete) specific lines from a very large file?

by Flame (Deacon)
on Dec 01, 2001 at 03:27 UTC ( [id://128791]=perlquestion )

Flame has asked for the wisdom of the Perl Monks concerning the following question:


Replies are listed 'Best First'.
Re: What is the best way to remove(delete) specific lines from a very large file?
by IlyaM (Parson) on Dec 01, 2001 at 03:39 UTC
    Read file line by line and write lines you need to another file. Once finished replace old file with new.

    Here is some code that looks a bit cryptic, but check this node, perldoc perlrun, and perldoc perlvar, and it should become clear.

    {
        local $^I   = '';
        local @ARGV = ($filename);
        while (<>) {
            next if CRITERIA_TO_SKIP_LINE;
            print;
        }
    }
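
The first sentence of this reply, spelled out without `$^I`, might look like the sketch below. The sub name `remove_lines`, the `.tmp` suffix, and the `$skip` callback are made up for illustration; the callback stands in for CRITERIA_TO_SKIP_LINE.

```perl
use strict;
use warnings;

# Copy $filename to a temporary file, leaving out every line for which
# $skip->($line) returns true, then rename the copy over the original.
# remove_lines, $skip and the ".tmp" suffix are illustrative names only.
sub remove_lines {
    my ($filename, $skip) = @_;
    my $tmp = "$filename.tmp";

    open(my $in,  '<', $filename) or die "Can't read $filename: $!\n";
    open(my $out, '>', $tmp)      or die "Can't write $tmp: $!\n";

    while (my $line = <$in>) {
        next if $skip->($line);
        print {$out} $line or die "Failed writing to $tmp: $!\n";
    }

    close($in);
    close($out) or die "Can't close $tmp: $!\n";
    rename($tmp, $filename) or die "Can't rename $tmp to $filename: $!\n";
    return;
}
```

It could be called as `remove_lines('big-file.txt', sub { $_[0] =~ /bad line/ });`. Only one line is held in memory at a time, but the disk needs room for both copies until the rename.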
Re: What is the best way to remove(delete) specific lines from a very large file?
by fundflow (Chaplain) on Dec 02, 2001 at 22:55 UTC
    Just to add to the above suggestions (and more that will surely follow), it is most likely a job for grep:
    grep -v 'bad line' big-file.txt > new-file.txt

    Or, alternatively:

    perl -i.orig -pe '$_="" if /bad line/' big-file.txt
Re: What is the best way to remove(delete) specific lines from a very large file?
by tye (Sage) on Dec 03, 2001 at 22:03 UTC
    If the file is so big that you don't have enough disk space available for both the "before" and "after" versions, then you can go to quite a bit of extra work to save disk space:
    my $fileName= "toBeTrimmed.txt";
    open( FILE, "+< $fileName" )
        or die "Can't update $fileName: $!\n";
    my $readPos= tell(FILE);
    my $writePos;
    while( <FILE> ) {
        if( wantLine($_) ) {
            $readPos= tell(FILE)
                or die "Can't tell after read: $!\n";
            if( defined($writePos) ) {
                seek( FILE, $writePos, 0 )
                    or die "Can't seek to write: $!\n";
                print FILE $_
                    or die "Failed writing to file: $!\n";
                $writePos= tell(FILE)
                    or die "Can't tell after write: $!\n";
                seek( FILE, $readPos, 0 )
                    or die "Can't seek to read: $!\n";
            }
        } elsif( ! defined($writePos) ) {
            $writePos= $readPos;
        }
    }
    if( defined($writePos) ) {
        truncate( FILE, $writePos )
            or warn "Can't truncate: $!\n";
    }
    But be sure to test this code on a subset of your file and make a backup of the file beforehand as a failure of this code will simply leave your file corrupted.

            - tye (but my friends call me "Tye")
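
The code in this reply calls `wantLine()` without defining it; it is whatever test fits your data. A minimal sketch, assuming you want to drop lines that match a pattern (the `bad line` pattern is borrowed from another reply in this thread and is purely illustrative):

```perl
use strict;
use warnings;

# Hypothetical wantLine(): return true for lines that should be kept.
# Here every line that does not contain "bad line" is kept.
sub wantLine {
    my ($line) = @_;
    return $line !~ /bad line/;
}
```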
Re: What is the best way to remove(delete) specific lines from a very large file?
by dws (Chancellor) on Dec 01, 2001 at 03:34 UTC
    What is the best way to remove(delete) specific lines from a very large file?

    Ah. The post with a succinct question!

    Read from file A, write anything you don't want to delete to file B. If the files are really large, it'll go faster to have the files on separate disk drives. Moving disk heads is relatively expensive. You can minimize head contention by reading from one head (file) while writing to a separate head (file). Unless, of course, the source file is fragmented, or the target disk is heavily fragmented.

Re: What is the best way to remove(delete) specific lines from a very large file?
by abhishek_akj (Initiate) on Aug 26, 2009 at 10:12 UTC
    use Tie::File;
    tie @array, 'Tie::File', "filename"
        or die "something is wrong: $!\n";
    But don't use scalar @array or anything else that calculates the size of @array, as that forces the whole file to be read. Also, don't use foreach to traverse this array, as it internally calculates the size of the array before traversing it; use while instead.
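
The reply above stops at the `tie`. A sketch of how the deletion itself might then look, walking the array with `while` and an index as suggested; `delete_matching` is an illustrative name, not part of Tie::File:

```perl
use strict;
use warnings;
use Tie::File;

# Delete every line matching $pattern from $filename via Tie::File.
# delete_matching is an illustrative name, not part of Tie::File.
sub delete_matching {
    my ($filename, $pattern) = @_;
    tie my @array, 'Tie::File', $filename
        or die "Can't tie $filename: $!\n";

    my $i = 0;
    # Fetching past the last record yields undef, which ends the loop.
    while (defined(my $line = $array[$i])) {
        if ($line =~ $pattern) {
            splice(@array, $i, 1);   # record removed; stay at index $i
        } else {
            $i++;
        }
    }
    untie @array;
    return;
}
```

It could be called as `delete_matching('big-file.txt', qr/bad line/);`. As far as I know, every `splice` makes Tie::File rewrite the remainder of the file, so this is likely to be slow if many lines are removed from a very large file.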
Re: What is the best way to remove(delete) specific lines from a very large file?
by scottysb (Initiate) on May 14, 2008 at 10:15 UTC
    Here's an example of how you can keep a single file capped at, let's say, 10 lines. Once the file reaches the maximum size, new entries are put at the end and the first line is removed.
    #!/usr/bin/perl
    my $FILE = $ARGV[0];
    my $MAXSIZE=10;
    exit -1 if ! -f $FILE;
    $num_lines = 0;
    open FILE, ">$FILE";
    for (;;) {
        chomp($mytime=`date +%T`);
        my $line = "$mytime";
        $line .= ";" . sprintf "%.2f", rand(10);
        $line .= ";" . sprintf "%.2f", rand(20);
        $line .= ";" . sprintf "%.2f", rand(30);
        $line .= ";" . sprintf "%.2f", rand(40);
        if($num_lines < $MAXSIZE){
            push @array, "$line\n";
            $num_lines++;
        }else{
            shift @array;
            push @array, "$line\n";
        }
        #Get the end position of the file
        seek(FILE, 0, 2);
        $endpos = tell(FILE);
        seek(FILE, 0, 0);#Seek to start
        print FILE @array;
        #For the rest of the bytes print " "
        $curpos = tell(FILE);
        for(my $cnt=$curpos;$cnt<$endpos;$cnt++){
            print FILE " ";
        }
        sleep 1;
    }
