PerlMonks  

Re^3: Remove Duplicates from a mbox file

by Anonymous Monk
on Oct 11, 2007 at 03:20 UTC (#644131)


in reply to Re: Re: Remove Duplicates from a mbox file
in thread Remove Duplicates from a mbox file

I couldn't get the Perl code above to work right, so I kept searching and found the script at the URL below. It seems to work great: it removed 2400 duplicates from a 200MB mbox file, and it automatically creates a backup for you. www.wdr1.com/hacks/mbox-dedup.pl


Re^4: Remove Duplicates from a mbox file
by Anonymous Monk on Oct 21, 2009 at 13:40 UTC
    Yes, but beware: it skips messages that do not have a Message-ID header, and those messages are not stored in the resulting file, so you'll have to keep the backup file anyway. However, every message that was skipped will be output.
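
To illustrate the caveat above, here is a minimal sketch of Message-ID based deduplication that keeps messages without a Message-ID header instead of dropping them. It is not the linked mbox-dedup.pl script and is written in Python rather than Perl; the function name `dedup_mbox` and the choice to fall back to a SHA-256 hash of the whole message when the header is missing are assumptions made for this example.

```python
import hashlib
import mailbox


def dedup_mbox(src_path, dst_path):
    """Copy messages from src_path to dst_path, skipping repeats.

    Returns a (kept, dropped) tuple. The source file is never modified,
    so it doubles as the backup.
    """
    seen = set()
    kept = dropped = 0
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)  # created if missing, appended to if present
    try:
        for msg in src:
            msg_id = msg.get("Message-ID")
            if msg_id is not None and str(msg_id).strip():
                key = ("id", str(msg_id).strip())
            else:
                # Assumption for this sketch: with no Message-ID, treat two
                # messages as duplicates only if headers and body are identical.
                key = ("sha256", hashlib.sha256(msg.as_bytes()).hexdigest())
            if key in seen:
                dropped += 1
                continue
            seen.add(key)
            dst.add(msg)
            kept += 1
        dst.flush()
    finally:
        src.close()
        dst.close()
    return kept, dropped
```

Calling `dedup_mbox("inbox.mbox", "inbox.dedup.mbox")` (both file names are placeholders) writes the unique messages to a new file. Use a fresh destination path, because an existing mbox at that path is appended to rather than replaced.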
