Replace duplicate files with hardlinks
by bruno (Friar) on Aug 10, 2008 at 19:37 UTC
This is my first post here, so please feel free to redirect this to any other section if this is not the place where it belongs.
I am posting this little script seeking your opinions on every aspect: design, layout, readability, speed, etc. It uses File::Find::Duplicates to find duplicate files recursively in a directory and, instead of just reporting or deleting them, it replaces them with hard links, so that disk space is freed but the files remain. I wrote it to practice some of the things that I'm trying to learn, but I found it quite useful for my /home directory (I could free 2 GB!).
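The basic idea looks roughly like this (a minimal sketch only, assuming the documented find_duplicate_files interface of File::Find::Duplicates; the inode check and same-filesystem guard are my own additions for illustration, and the full script below does more):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Find::Duplicates;    # exports find_duplicate_files()

    # Minimal sketch: replace duplicates under a directory with hard links.
    my $dir = shift @ARGV or die "Usage: $0 <directory>\n";

    for my $set ( find_duplicate_files($dir) ) {
        # Each set holds the paths of files with identical contents;
        # keep the first one as the "master" and relink the rest to it.
        my ( $master, @copies ) = @{ $set->files };
        my ( $mdev, $mino ) = ( stat $master )[ 0, 1 ];

        for my $copy (@copies) {
            my ( $dev, $ino ) = ( stat $copy )[ 0, 1 ];

            next if $dev == $mdev && $ino == $mino;   # already the same inode
            next unless $dev == $mdev;                # hard links need one filesystem

            unless ( unlink $copy ) {
                warn "Could not unlink $copy: $!\n";
                next;
            }
            link $master, $copy
                or warn "Could not link $master -> $copy: $!\n";
        }
    }

A more careful version would link the master to a temporary name first and then rename() it over the duplicate, so the path is never missing even briefly.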
I thought that creating a hard link might be a better idea than deleting the file, as sometimes one wants a certain file to be under a certain path.

Update 2: Link filtering, soft link / remove support and documentation. Here
It also helped me find severe redundancy in some "dot directories". For instance, in a couple of icon packages, ~30% of the files were duplicates with different names. In that case, deleting them would have broken the icon set, whereas creating hard links both freed space and kept the package functional.
I was also pleasantly surprised that it is quite fast. I haven't benchmarked it (I haven't read the Benchmark documentation yet), but it is noticeably faster than, for example, the fdupes program that comes with Ubuntu (and probably other Debian-based distros).
Here's the code for it:
Update: Subroutine fuse() changed following betterworld's suggestion.