http://www.perlmonks.org?node_id=1104935


in reply to File::Find duplicate question

"Does it look correct?" Sure! But that's easy for me to say. Can't you test it on some examples and see if it gives you your expected output? If it doesn't, then you have a specific example where it goes wrong, and it would be easier to help you.

The program first groups files by size, then compares MD5 hashes within each size group to identify duplicates, working from the largest files down. So yes, it checks file sizes first, which also lets it report the worst space hogs quickly. Seems like a nice feature.
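In case it helps to see that idea in isolation, here is a minimal sketch of the size-then-MD5 approach. It is not your script: the helper name find_duplicates and its return format are made up for illustration, and it skips empty and unreadable files, which your program may treat differently.

```perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;

# Hypothetical helper (not from the original script): returns a list of
# array refs, each holding the paths of files with identical content,
# ordered from the largest file size down.
sub find_duplicates {
    my @dirs = @_;
    my %by_size;

    # Pass 1: bucket regular files by size. This is cheap because no
    # file contents are read yet.
    find(sub {
        return unless -f $_;
        my $size = -s $_;
        return unless $size;            # skip empty files
        push @{ $by_size{$size} }, $File::Find::name;
    }, @dirs);

    # Pass 2: hash only files that share a size, largest sizes first.
    my @groups;
    for my $size (sort { $b <=> $a } keys %by_size) {
        my @same_size = @{ $by_size{$size} };
        next if @same_size < 2;         # a unique size cannot be a duplicate

        my %by_md5;
        for my $path (@same_size) {
            open my $fh, '<', $path or next;   # skip unreadable files
            binmode $fh;
            my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
            close $fh;
            push @{ $by_md5{$digest} }, $path;
        }
        push @groups, grep { @$_ > 1 } values %by_md5;
    }
    return @groups;
}

# Only scan when directories are given on the command line.
if (@ARGV) {
    print join("\n", @$_), "\n\n" for find_duplicates(@ARGV);
}
```

Most files never get hashed at all, since a file whose size is unique is skipped after the first pass.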

As for getting Windows paths, a simple approach would be to transliterate the forward slashes in the file name to backslashes on line 28 (note that the /r flag needs Perl 5.14 or later):

push @{$md5{Digest::MD5->new->addfile(*FILE)->hexdigest}}, $file =~ tr!/!\\!r . "\n"; # <-- instead of: $file . "\n"
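If you are stuck on a perl older than 5.14, where tr has no /r flag, the usual workaround is to copy the string first and transliterate the copy. A small sketch, with to_windows_path being a made-up helper name rather than anything from your script:

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of $path with every forward slash
# turned into a backslash, leaving the original string untouched.
sub to_windows_path {
    my ($path) = @_;
    (my $win = $path) =~ tr{/}{\\};   # copy, then transliterate the copy
    return $win;
}

print to_windows_path('C:/Users/me/file.txt'), "\n";   # prints C:\Users\me\file.txt
```

On line 28 you would then push to_windows_path($file) . "\n" instead of the tr expression.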