Thanks, that will catch binary dupes nicely. I wanted to start with duplicated filenames and leave binary comparison for a later pass. I find that quite a bit of the music my kids leave around has different binary content but the same filename. The reverse case happens too, though, so I'll add MD5 summing as an option. I also want to save the date and size as well as the path+name (hence the array).
The bit I've been most unclear on is how to go from finding the first file and setting
key1 => [ <path1> ]
to adding the data for the second instance:
key1 => [ <path1>, <path2> ]
but hopefully I can put the suggestions above into practice now.
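
For anyone following along, here is a rough sketch of the step I mean, going from key1 => [ <path1> ] to key1 => [ <path1>, <path2> ]. It is written in Python purely for illustration, since no code appears in this post, and it is not the code suggested earlier in the thread. The names find_dupes, file_md5 and by_content are made up for the example, and storing each entry as a (path, size, mtime) tuple is just one assumed way to keep the date and size next to the path.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_dupes(root, by_content=False):
    """Group files under root by filename, or by MD5 if by_content is True.

    Returns only the keys that were seen more than once.
    """
    # Each key maps to a list of (path, size, mtime) records.
    seen = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            key = file_md5(path) if by_content else name
            # The first sighting creates the list for this key;
            # every later sighting appends to the same list.
            seen[key].append((path, info.st_size, info.st_mtime))
    return {key: records for key, records in seen.items() if len(records) > 1}
```

The point of the defaultdict is that there is no separate "first file" case: appending to a key that does not exist yet starts a new list, and appending to one that does exist extends it. Calling find_dupes(some_dir) would give the same-name groups, and find_dupes(some_dir, by_content=True) the same-content groups.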