Thank you for asking the question, Chris. I'll be watching the answers carefully.
I have a similar problem. I have written about a dozen modules in the recent past, but all of them were too specific to our company's needs to justify publishing them on the CPAN or elsewhere. Recently, however, I wrote a module for comparing large data files which I think may be useful to other people. This module can remove duplicate records, find and remove "orphans" (records that are in one file and not in the other) and then compare the data. I have used it in two different contexts; it does what is needed and I think it works well. And I wrote it at home in my free time, so I am really free to release it as public domain, as free software, or under whatever other suitable open license.
My main problem, though, is that I do not know how to build a CPAN distribution. In particular, I do not know how to provide test cases that will work with Unix files, Windows files, VMS files and so on.
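
To give an idea of what the module described above does, here is a minimal sketch of the principle. It is not the module itself: the function names `index_records` and `compare_records`, the `key;value` record layout and the in-memory hashes are all assumptions made for illustration, and a version meant for really large files would probably not hold everything in memory like this.

```perl
use strict;
use warnings;

# Hypothetical helper (not the real module's API): index "key;value"
# records by key. When a key is repeated, the first record wins,
# which is how duplicates are dropped in this sketch.
sub index_records {
    my ($records) = @_;
    my %by_key;
    for my $rec (@$records) {
        my ($key, $value) = split /;/, $rec, 2;
        $value = '' unless defined $value;
        $by_key{$key} = $value unless exists $by_key{$key};
    }
    return \%by_key;
}

# Hypothetical entry point: report orphans on each side and the keys
# present on both sides whose values differ.
sub compare_records {
    my ($left, $right) = @_;
    my $l = index_records($left);
    my $r = index_records($right);

    my @only_left  = sort grep { !exists $r->{$_} } keys %$l;
    my @only_right = sort grep { !exists $l->{$_} } keys %$r;
    my @different  = sort grep { exists $r->{$_} && $l->{$_} ne $r->{$_} } keys %$l;

    return {
        only_left  => \@only_left,
        only_right => \@only_right,
        different  => \@different,
    };
}

my $result = compare_records(
    [ 'a;1', 'b;2', 'b;2', 'c;3' ],    # 'b;2' appears twice on purpose
    [ 'a;1', 'b;9', 'd;4' ],
);

print "only in left:  @{ $result->{only_left} }\n";     # c
print "only in right: @{ $result->{only_right} }\n";    # d
print "different:     @{ $result->{different} }\n";     # b
```

Here the records are passed as plain lists so that the example does not depend on any file or line-ending convention; reading them from real files is exactly the part I am unsure how to test portably.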