PerlMonks  

How to remove duplicate records

by SatisfyMyStruggles (Initiate)
on Jun 09, 2013 at 08:26 UTC (#1037915)
SatisfyMyStruggles has asked for the wisdom of the Perl Monks concerning the following question:

Can someone please show me how to remove duplicate records? I read in a file of multi-line records, with the input record separator set to a blank line:

$/ = "\n\n";
open FILE, "LogMessages.txt" or die $!;
while (<FILE>) {
}

Re: How to remove duplicate records
by Corion (Pope) on Jun 09, 2013 at 08:28 UTC

    This is a FAQ. See perlfaq4 on "duplicate", or alternatively run

    perldoc -q duplicate
Re: How to remove duplicate records
by hdb (Prior) on Jun 09, 2013 at 09:06 UTC

    Use a hash to record whether a record has already been seen, with the record itself as the key.

    use strict;
    use warnings;

    $/ = "\n\n";
    my %seen;
    while (<DATA>) {
        print unless $seen{$_}++;
    }
    __DATA__
    a

    b

    b

    a

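Applied to a file such as the one in the question, the same idea might look like the sketch below. It assumes records are separated by a single blank line, and it strips trailing newlines before the lookup so that a final record with no blank line after it still matches an earlier copy. The sub name dedupe_records is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read blank-line-separated records from a filehandle
# and return the unique ones in first-seen order.
sub dedupe_records {
    my ($fh) = @_;
    local $/ = "\n\n";      # same record separator as in the question
    my (%seen, @unique);
    while (my $record = <$fh>) {
        $record =~ s/\n+\z//;    # normalize: drop all trailing newlines
        push @unique, $record unless $seen{$record}++;
    }
    return @unique;
}

# Demo on an in-memory filehandle. For the real file, open it instead with:
#   open my $fh, '<', 'LogMessages.txt' or die "Cannot open LogMessages.txt: $!";
my $sample = "a\n\nb\n\nb\n\na\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print "$_\n\n" for dedupe_records($fh);
close $fh;
```

Using a lexical filehandle and three-argument open is a stylistic choice here, not something the thread asked for; the deduplication logic is unchanged.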
Re: How to remove duplicate records
by Old_Gray_Bear (Bishop) on Jun 09, 2013 at 22:04 UTC
    A non-Perl solution for use on Linux and other Unix-like systems:
    $ sort -u my.input.file

    ----
    I Go Back to Sleep, Now.

    OGB

Re: How to remove duplicate records
by rpnoble419 (Pilgrim) on Jun 10, 2013 at 04:14 UTC

    Before a proper solution can be identified, what does your data look like, and what are the criteria for a record to be a dupe? You may have hundreds of calls in your log file to a specific graphic, but if they all come from different IP addresses at different times, they are not dupes. Also, how big is your log file? A hash-based dupe check may not work over millions of records.
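Both concerns can be sketched in code. The example below assumes, purely for illustration, that the first line of each record is a timestamp and that two records are duplicates when everything after that line matches. It also keeps only a 16-byte MD5 digest of each key instead of the full text, and writes records out as it goes rather than collecting them. The record layout and the name dedupe_by_key are assumptions, not something the original poster described.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Hypothetical helper: copy records from $in to $out, skipping any record
# whose key (as computed by the $key_of callback) has been seen before.
# Returns the number of records written.
sub dedupe_by_key {
    my ($in, $out, $key_of) = @_;
    local $/ = "\n\n";
    my %seen;
    my $kept = 0;
    while (my $record = <$in>) {
        $record =~ s/\n+\z//;                   # drop trailing newlines
        my $digest = md5($key_of->($record));   # input is read as bytes
        next if $seen{$digest}++;
        print {$out} "$record\n\n";
        $kept++;
    }
    return $kept;
}

# Demo with the assumed layout: timestamp line, then the message.
my $log = "10:00\nGET /logo.png\n\n10:05\nGET /logo.png\n\n10:07\nGET /index.html\n";
open my $in, '<', \$log or die "Cannot open in-memory input: $!";
my $kept = dedupe_by_key($in, \*STDOUT, sub {
    my ($record) = @_;
    $record =~ s/\A[^\n]*\n//;    # ignore the first line (assumed timestamp)
    return $record;
});
close $in;
```

The hash of digests still grows with the number of distinct keys, so this only reduces the per-record cost. A digest collision would wrongly drop a record, which is very unlikely to happen by accident but is a trade-off to be aware of.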
