Re: Delete duplicate data in file

by pg (Canon)
on Nov 21, 2005 at 04:37 UTC (#510361)

in reply to Delete duplicate data in file

It looks like your rows are sorted by timestamp (I will assume this for the rest of this post), so any duplicates will be next to each other (there could be more than two in a row). All you need to do is this (the algorithm):

open the file;
open a temp file for output;
set $lastrow to '';
while (file not empty) {
    read one row;
    if (this row equals $lastrow) {
        do nothing;
    } else {
        write this row to the output file;
        set $lastrow to this row;
    }
}
close both files;
copy the temp file to the original file;

The Perl code would be close to this:

use strict;
use warnings;

my $lastrow = "";
while (my $line = <DATA>) {
    chomp $line;    # strip the trailing newline, if there is one
    if ($line ne $lastrow) {
        # first row of a new run: print it and remember it
        print $line, "\n";
        $lastrow = $line;
    }
}
__DATA__

This prints each row once, skipping any row that is identical to the one immediately before it.
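
The snippet above reads from DATA, while the algorithm talks about a temp file that replaces the original. A minimal sketch of that file-based variant might look like the following. The helper name dedupe_adjacent is hypothetical (it is not from the original post), and it still assumes the rows are sorted so that duplicates are adjacent:

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempfile);

# Hypothetical helper: rewrite $path in place, dropping every row that is
# identical to the row just before it. Assumes duplicates are adjacent.
sub dedupe_adjacent {
    my ($path) = @_;

    open my $in, '<', $path or die "Cannot open $path: $!";

    # Temp file that receives the filtered rows.
    my ($out, $tmp_path) = tempfile();

    # Starts undefined, so an empty first row is still written.
    my $lastrow;
    while (my $line = <$in>) {
        chomp $line;
        next if defined $lastrow && $line eq $lastrow;
        print {$out} $line, "\n";
        $lastrow = $line;
    }

    close $in;
    close $out or die "Cannot close $tmp_path: $!";

    # Replace the original with the filtered copy.
    move($tmp_path, $path) or die "Cannot replace $path: $!";
    return;
}
```

You would call it as dedupe_adjacent('yourfile.log'), where the file name is only a placeholder. Note that the replaced file ends up with the temp file's permissions, so you may want to chmod it afterwards.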
