in reply to
From your previous question I got:
Each object entry requires at least one FETCH (to see if the object already exists), and one STORE...
My question is, Is this necessary? I don't know what you are trying do accomplish, but if memory is not an issue, and keeping up with SQUID is, then you want as fast a process as possible writing to your DB as SQUID runs. I haven't noticed if you've said anything about the time constraints on actually using this data... who or what is querying the datastructure after you build it? do they care if it takes longer to query? do they care if the order of complexity for a query is increased?
if not, then we can fix you in no time... just don't do the FETCH op. STORE everything, and when you go to read the data (A) query all the lines you need, (B) delete them, (C)build your tree and give it to the user, and (D) re-insert the tree while the user is playing with it.
also, you potentially could change the datastructure... if you can generate an md5 in constant time, and memory is not a problem, you could generate an array of the length of possible checksums, then you could read and write to it in constant time. more realisticly if memory is not an issue you could play with very large arrays, indexed on the first N chars of the checksum...
point is, instead of building something that scales up well, you could build something that doesn't scale at all, and is just always going to be bigger than your dataset. that'll always win for speed.