Re: File::Find duplicate question

by crashtest (Curate)
on Oct 24, 2014 at 23:09 UTC (#1104935)

in reply to File::Find duplicate question

"Does it look correct?" Sure! But that's easy for me to say. Can't you test it on some examples and see if it gives you your expected output? If it doesn't, then you have a specific example where it goes wrong, and it would be easier to help you.

The program first groups files by size, then compares MD5 hashes within each group to identify duplicates, working from the largest files down. So yes, it checks file sizes first, which lets it report the worst space hogs quickly. Seems like a nice feature.
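For reference, here is a minimal sketch of that size-then-MD5 approach. This is not your script, just my guess at the same idea: the name find_duplicates and the choice to skip unreadable files are my own assumptions.

```perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;

# Hypothetical helper: returns a list of array refs, each holding the
# paths of files whose contents are identical.
sub find_duplicates {
    my ($dir) = @_;

    # Step 1: bucket every plain file by its size in bytes.
    my %by_size;
    find({
        no_chdir => 1,              # keep $_ as the full path
        wanted   => sub {
            return unless -f $_;
            my $size = -s _;        # reuse the stat from -f
            push @{ $by_size{$size} }, $_;
        },
    }, $dir);

    # Step 2: walk the sizes from largest to smallest and hash only
    # the sizes that more than one file shares.
    my @groups;
    for my $size (sort { $b <=> $a } keys %by_size) {
        my @files = @{ $by_size{$size} };
        next if @files < 2;

        my %by_md5;
        for my $file (@files) {
            open my $fh, '<', $file or next;    # assumption: skip unreadable files
            binmode $fh;
            push @{ $by_md5{ Digest::MD5->new->addfile($fh)->hexdigest } }, $file;
            close $fh;
        }
        push @groups, map { [ sort @$_ ] } grep { @$_ > 1 } values %by_md5;
    }
    return @groups;
}

# Only runs when a directory is given on the command line.
if (@ARGV) {
    for my $group (find_duplicates($ARGV[0])) {
        print "$_\n" for @$group;
        print "\n";
    }
}
```

Files with a unique size are never hashed, which is where most of the time is saved on a big tree.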

As for getting Windows paths, a simple approach would be to translate the forward slashes to backslashes in the file name on line 28 (the /r flag needs Perl 5.14 or newer):

push @{$md5{Digest::MD5->new->addfile(*FILE)->hexdigest}}, $file =~ tr!/!\\!r ."\n"; # <-- instead of: $file . "\n"
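If you are stuck on a Perl older than 5.14, where tr///r is not available, something along these lines should do the same job. The helper name to_windows_path is made up for illustration; it copies the string first and then transliterates the copy.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of the path with every forward
# slash turned into a backslash, leaving the original untouched.
sub to_windows_path {
    my ($path) = @_;
    (my $win = $path) =~ tr{/}{\\};   # copy, then transliterate the copy
    return $win;
}
```

You would then push to_windows_path($file) . "\n" instead of $file . "\n".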

Re^2: File::Find duplicate question
by Anonymous Monk on Oct 25, 2014 at 13:18 UTC
    Thanks for replying
