http://www.perlmonks.org?node_id=147156


in reply to Performance quandary

I'm a little surprised that nobody has mentioned you're dug in up to the axle with recursive calls of add_entry. What's more, the way it happens suggests that your database is expected to have an expandable number of columns, since you appear to be splitting out each directory as a $key.

As I understand it, you want to index a collection of URIs by an md5 digest. It looks like both the location and content are digested together. I don't grok exactly how much of the paths you need to digest. If you need to track moved files, perhaps you should digest files and locations separately.
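
To make the "digest separately" idea concrete, here is a minimal sketch. The helper name digest_pair and the example URLs are made up for illustration; only Digest::MD5's md5_hex is a real call, and I'm assuming you already have the content in hand as a string.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper (not from the original code): returns one digest for
# where a document lives and another for what it contains.
sub digest_pair {
    my ($location, $content) = @_;
    return {
        location => md5_hex($location),
        content  => md5_hex($content),
    };
}

# Made-up example data: the same content seen at two different locations.
my $old = digest_pair('http://example.com/a/page.html', "hello\n");
my $new = digest_pair('http://example.com/b/page.html', "hello\n");

# Same content digest but a different location digest means the file moved.
print "moved\n"
    if $old->{content} eq $new->{content}
    and $old->{location} ne $new->{location};
```

With the two digests kept apart, a moved file still matches on its content digest, which a single combined digest can't tell you.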

I think your scaling problems will disappear if you replace the pseudocode:

sub yours (@)
    parse_args
    peel_off_an_extra_arg_with_sides or return
    call_yours_again_with_sides

with a refactoring:

use URI;
sub ThisAddUri($uri)
    make_updateVectors
        map md5Nstuff uri(shift)
    updateDB thatmap

or maybe better, make the db part just an error-resistant wrapper, and keep the data array prep a separate call.
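
Roughly what I have in mind, as a hedged sketch in real Perl. The names make_update_vectors and update_db are mine, and I'm guessing that you want one key per directory level. I split the path with a regex so the sketch runs on core modules only; the URI module would do that part more robustly. The "database" here is just a hash reference, which could as easily be a tied DBM hash.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical data prep: one [key, digest] pair per path level, built with
# a plain loop instead of one recursive call per directory.
sub make_update_vectors {
    my ($uri) = @_;
    my ($base, $path) = $uri =~ m{^(\w+://[^/]+)(/.*)?$}
        or return;    # not an absolute URI this sketch understands
    my @segments = grep { length } split m{/}, defined $path ? $path : '';
    my @vectors;
    my $key = $base;
    for my $segment (@segments) {
        $key .= "/$segment";
        push @vectors, [ $key, md5_hex($key) ];
    }
    return @vectors;
}

# Hypothetical error-resistant wrapper: $db is any hash reference. A failed
# store is warned about and skipped rather than killing the whole run.
sub update_db {
    my ($db, @vectors) = @_;
    my $stored = 0;
    for my $vector (@vectors) {
        my ($key, $digest) = @$vector;
        if (eval { $db->{$key} = $digest; 1 }) {
            $stored++;
        }
        else {
            warn "could not store $key: $@";
        }
    }
    return $stored;
}

my %db;
update_db(\%db, make_update_vectors('http://example.com/a/b/c.html'));
print "$_\n" for sort keys %db;
```

Each URI costs one pass over its path segments and one pass over the resulting pairs, with no call stack growing along the way. Keeping the prep separate from the store also lets you test the prep without touching the database.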

You will come out way ahead if you get rid of the recursive calls, however you do it.

After Compline,
Zaxo