in reply to Performance quandary
I'm a little surprised that nobody has mentioned you're dug in up to the axle with recursive calls of add_entry. What's more, the way it happens suggests that your database is expected to have an expandable number of columns, since you appear to be splitting out each directory as a $key.
As I understand it, you want to index a collection of URIs by an md5 digest. It looks like both the location and content are digested together. I don't grok exactly how much of the paths you need to digest. If you need to track moved files, perhaps you should digest files and locations separately.
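To make the "digest separately" idea concrete, here is a minimal sketch. The helper name digest_pair and the example URLs are made up for illustration, and it assumes you already have the content in hand as a string:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: one digest for where the file lives and a
# separate one for what it contains, so a moved file keeps its
# content digest.
sub digest_pair {
    my ($location, $content) = @_;
    return {
        location => md5_hex($location),
        content  => md5_hex($content),
    };
}

my $before = digest_pair('http://example.com/a/page.html', 'hello world');
my $after  = digest_pair('http://example.com/b/page.html', 'hello world');

# Same content digest but a different location digest: the file moved.
print "moved\n"
    if $before->{content} eq $after->{content}
    && $before->{location} ne $after->{location};
```

With a single combined digest, those two entries would look completely unrelated.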
I think your scaling problems will disappear if you replace the pseudocode:
    sub yours (@)
        parse_args
        peel_off_an_extra_arg_with_sides or return
        call_yours_again_with_sides

with a refactoring:
    use URI;
    sub ThisAddUri($uri)
        make_updateVectors map md5Nstuff uri(shift)
        updateDB thatmap

or maybe better, make the db part just an error-resistant wrapper, and keep the data array prep a separate call.
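Here is one way the second variant might look, as a rough sketch and not a drop-in replacement for your add_entry. The names make_update_vectors, update_db and add_uris are mine, the [digest, uri] row layout is an assumption, and $store is a hypothetical callback standing in for whatever actually writes to your database. I left URI out to keep the sketch dependency-free; you would probably want it for real parsing.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Data prep: one flat pass over the URIs, no recursion.
# Each row is [digest, uri]; that layout is an assumption.
sub make_update_vectors {
    my @uris = @_;
    return map { [ md5_hex($_), $_ ] } @uris;
}

# Error-resistant wrapper: $store is a hypothetical callback that
# writes one row. Failures are warned about and skipped.
# Returns the number of rows stored.
sub update_db {
    my ($store, @rows) = @_;
    my $stored = 0;
    for my $row (@rows) {
        if (eval { $store->(@$row); 1 }) {
            $stored++;
        }
        else {
            warn "could not store $row->[1]: $@";
        }
    }
    return $stored;
}

# Glue: prep the rows, then hand them to the wrapper.
sub add_uris {
    my ($store, @uris) = @_;
    return update_db($store, make_update_vectors(@uris));
}

# Usage, with an in-memory hash standing in for the database.
my %db;
my $count = add_uris(
    sub { my ($digest, $uri) = @_; $db{$digest} = $uri },
    'http://example.com/a/b/c.html',
    'http://example.com/a/d.html',
);
print "stored $count rows\n";
```

Keeping the prep separate means you can test the digesting without a database, and one bad row no longer takes the whole run down with it.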
You will come out way ahead if you get rid of the recursive calls, however you do it.