PerlMonks  

Re: Delete duplicate data in file

by pg (Canon)
on Nov 21, 2005 at 04:37 UTC ( [id://510361]=note )


in reply to Delete duplicate data in file

It looks like your rows are sorted by timestamp (I will assume this for the rest of this post), so any duplicates will be next to each other (possibly in runs of more than two rows). All you need to do is the following (the algorithm):

open the file;
open a temp file for output;
set $lastrow to '';
while (not at end of file) {
    read one row;
    if (this row equals $lastrow) {
        do nothing;
    } else {
        write this row to the output file;
        set $lastrow to this row;
    }
}
close both files;
copy the temp file over the original file;

The Perl code would be close to this:

use strict;
use warnings;

my $lastrow = "";
while (my $line = <DATA>) {
    chomp $line;    # strip the trailing newline, if there is one
    if ($line ne $lastrow) {
        # first row of a new run: print it and remember it
        print $line, "\n";
        $lastrow = $line;
    }
}

__DATA__
data_row_051126120432.data
data_row_051126120630.data
data_row_051126120630.data
data_row_051126122305.data
data_row_051126122305.data

This prints:

data_row_051126120432.data
data_row_051126120630.data
data_row_051126122305.data
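
The example reads from DATA and prints to STDOUT, while the algorithm works on a real file through a temp file. A minimal sketch of that file-based variant might look like the following. The sub name dedupe_adjacent, the temp file template, and taking the path from the command line are my own assumptions for illustration, not part of the original question. It also assumes, as above, that duplicates are always adjacent.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempfile);

# Hypothetical helper: drop adjacent duplicate rows from the file at $path.
# Only works if duplicates are next to each other (e.g. the file is sorted).
sub dedupe_adjacent {
    my ($path) = @_;

    open my $in, '<', $path or die "Cannot open $path: $!";

    # Temp file in the current directory; the template name is arbitrary.
    my ($out, $tmp) = tempfile("dedupe_XXXXXX");

    my $lastrow;    # undef until the first row has been seen
    while (my $line = <$in>) {
        chomp $line;
        next if defined $lastrow && $line eq $lastrow;    # same as previous row
        print {$out} $line, "\n";
        $lastrow = $line;
    }

    close $in;
    close $out or die "Cannot close $tmp: $!";

    # Copy the temp file over the original, then remove the temp file.
    copy($tmp, $path) or die "Cannot copy $tmp to $path: $!";
    unlink $tmp or warn "Cannot remove $tmp: $!";
    return;
}

# Usage (assumed): perl dedupe.pl yourfile.txt
dedupe_adjacent($ARGV[0]) if @ARGV;
```

Starting $lastrow as undef rather than '' means an empty first row is kept instead of being treated as a duplicate. If the rows are not guaranteed to be sorted, this approach will miss duplicates that are not adjacent.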
