in reply to Removing Duplicate Files
Dupseek is a pretty good Perl implementation of what you're after, and it has been around for a while now. I've never tested it on a data set of this size, though.
Tim