Re: Delete duplicate data in file

by pg (Canon)
on Nov 21, 2005 at 04:37 UTC


in reply to Delete duplicate data in file

It looks like your rows are sorted by timestamp (I will assume this for the rest of this post), so any duplicates will be next to each other (there could be more than two identical rows in a run). All you need to do is the following (the algorithm):

open the file;
open a temp file for output;
set $lastrow to '';
while (there are rows left to read) {
    read one row;
    if (this row equals $lastrow) {
        do nothing;
    }
    else {
        write this row to the output file;
        set $lastrow to this row;
    }
}
close both files;
copy the temp file to the original file;

The Perl code would be close to this:

use strict;
use warnings;

my $lastrow = "";
while (my $line = <DATA>) {
    chomp $line;    # strip the trailing newline, if there is one
    if ($line ne $lastrow) {
        # first occurrence of this row: print it and remember it
        print $line, "\n";
        $lastrow = $line;
    }
}

__DATA__
data_row_051126120432.data
data_row_051126120630.data
data_row_051126120630.data
data_row_051126122305.data
data_row_051126122305.data

This prints:

data_row_051126120432.data
data_row_051126120630.data
data_row_051126122305.data
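The sample above reads from DATA and prints to STDOUT, while the algorithm talks about a real file and a temp file. A minimal sketch of the file-based version might look like the following. The sub name dedupe_sorted_file and the use of File::Temp and File::Copy are my own choices for illustration, not something from the original question, and it still assumes the rows are sorted so that duplicates are adjacent.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Copy qw(copy);

# Hypothetical helper: remove adjacent duplicate rows from $path in place.
# Assumes the rows are already sorted, so duplicates are neighbours.
sub dedupe_sorted_file {
    my ($path) = @_;

    open my $in, '<', $path or die "Cannot open $path: $!";
    my ($out, $tmp) = tempfile();    # temp file that receives the filtered rows

    my $lastrow;                     # undef until the first row has been seen
    while (my $line = <$in>) {
        chomp $line;
        # same as the previous row: skip it
        next if defined $lastrow && $line eq $lastrow;
        print {$out} $line, "\n";
        $lastrow = $line;
    }

    close $in;
    close $out or die "Cannot close $tmp: $!";

    # copy the temp file over the original, then remove the temp file
    copy($tmp, $path) or die "Cannot copy $tmp to $path: $!";
    unlink $tmp;
    return;
}

# Run on the file named on the command line, if one was given.
dedupe_sorted_file($ARGV[0]) if @ARGV;
```

You would run it as something like "perl dedupe.pl yourfile" (the script name is just a placeholder). Keep a backup of the file until you trust the result, since the original is overwritten.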

