Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

Re: hash collision DOS

by tilly (Archbishop)
on Jun 03, 2003 at 03:35 UTC ( #262557=note: print w/ replies, xml ) Need Help??


in reply to hash collision DOS

It is quite easy to solve the problem permanently. Just use another fast structure, like a BTREE, and the possibility of pathological cases becomes impossible. Alternately one can use a hash, and then add logic to detect when a hash bucket has an abnormally long linked list, then switch it on the fly to a more robust structure.

The problem is that the average performance of hashes are very, very hard to beat, and any kind of paranoid safety logic will be hit all of the time, slowing Perl down a bit on virtually every step. Odds are that nobody in the wild has actually been hurt by Perl's hashing behaviour (yet). I guarantee that people will wonder why Perl slowed down overall. When do you make the engineering decision that average case performance matters more this time than worst case performance?


Comment on Re: hash collision DOS
Re: Re: hash collision DOS
by BrowserUk (Pope) on Jun 03, 2003 at 04:17 UTC

    I commented on a possible 'fix' for the potential problem with perls current hashing function at Re: Hash Clash on purpose and would be interested on your assessment of the notion.

    If that isn't a fix, or cannot be implemented without undue penalty, I wonder if there isn't at least some mileage in having a new warning, along the lines of the 'Deep recursion' warning, that simply notes that a hash is exhibiting anomolous or 'worst case' behaviour at runtime. It wouldn't fix anything, but its presence in the logs could go a long way in trying to track down such behaviour down?

    Maybe a corresponding pragma that allowed the 'hash quality' threshold that would trigger the warning to be specified?


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


      I would have to examine the hash function in some detail to tell whether or not that would work as a fix (at a minimal performance loss and the space overhead of having to store the initial value for each hashing function in the hash).

      Deep recursion checks in the list traversal logic is exactly the solution that I thought was likely to slow things down. Perl spends a lot of its time doing hash lookups, and any overhead (eg having to use up an extra register to track how deep the list went) is likely to show performance problems.

      But thinking about the issue, I am not sure that a clever person couldn't manage to add such a check in some way that didn't cause significant performance problems. I simply have not worked close enough to the metal enough to have a sense of that. If you could add the performance check, though, there is no reason that you couldn't just fall back into a more complex and robust data structure instead of a linked list. The immediate algorithmic overhead is paid for by the fact that you only use it when you really need it.

        perl spends a lot of its time doing hash lookups

        True, but my (unstated) thought was that such 'hash quality' check would only need to be done at STORE time, not FETCH time. As such, the costs would be a significantly smaller proportion of the overall operation than with the FETCH, as it would be along side the memory allocation, bucket shuffling etc.?


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://262557]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others perusing the Monastery: (4)
As of 2014-10-26 01:18 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    For retirement, I am banking on:










    Results (149 votes), past polls